METHODS AND APPARATUS FOR SCALABLE ARRAY 
PROCESSOR INTERRUPT DETECTION AND RESPONSE 

The present application claims the benefit of U.S. Provisional Application Serial No. 
60/184,529 entitled "Methods and Apparatus for Scalable Array Processor Interrupt 
Detection and Response" filed February 24, 2000 which is incorporated by reference herein 
in its entirety. 
Field of the Invention 

The present invention relates generally to improved techniques for interrupt detection 
and response in a scalable pipelined array processor. More particularly, the present invention 
addresses methods and apparatus for such interrupt detection and response in the context of 
highly parallel scalable pipeline array processor architectures employing multiple processing 
elements, such as the manifold array (ManArray) architecture. 
Background of the Invention 

The typical architecture of a digital signal processor is based upon a sequential model 
of instruction execution that keeps track of program instruction execution with a program 
counter. When an interrupt is acknowledged in this model, the normal program flow is 
interrupted and a branch to an interrupt handler typically occurs. After the interrupt is 
handled, a return from the interrupt handler occurs and the normal program flow is restarted. 
This sequential model must be maintained in pipelined processors even when interrupts occur 
that modify the normal sequential instruction flow. The sequential model of instruction 
execution is used in the advanced indirect very long instruction word (iVLIW) scalable 
ManArray processor even though multiple PEs operate in parallel each executing up to five 
packed data instructions. The ManArray family of core processors provides multiple cores 
(1x1, 1x2, 2x2, 2x4, 4x4, and so on) that provide different performance characteristics 
depending upon the number and type of processing elements (PEs) used in the cores. 

Each PE typically contains its own register file and local PE memory, resulting in a 
distributed memory and distributed register file model. Each PE, if not masked off, executes 
instructions in synchronism and in a sequential flow as dictated by the instruction sequence 
fetched by a sequence processor (SP) array controller. The SP controls the fetching of the 
instructions that are sent to all the PEs. This sequential instruction flow must be maintained 
across all the PEs even when interrupts are detected in the SP that modify the instruction 
sequence. The sequence of operations and machine state must be the same whether an 
interrupt occurs or not. In addition, individual PEs can cause errors which can be detected 
and reported by a distributed interrupt mechanism. In a pipelined array processor, 
determining which instruction, which PE, and which data element in a packed data operation 
may have caused an exception type of interrupt is a difficult task. 

In developing complex systems and debugging of complex programs, it is important 
to provide mechanisms that control instruction fetching, provide single-step operation, 
monitor for internal core and external core events, provide the ability to modify registers, 
instruction memory, VLIW memory (VIM), and data memory, and provide instruction 
address and data address eventpoints. There are two standard approaches to achieving the 
desired observability and controllability of hardware for debug purposes. 

One approach involves the use of scan chains and clock-stepping, along with a 
suitable hardware interface, possibly via a joint test action group (JTAG) interface, to a 
debug control module that supports basic debug commands. This approach allows access on 
a cycle by cycle basis to any resources included in the scan chains, usually registers and 
memory. It relies on the library/process technology to support the scan chain insertion and 
may change with each implementation. 

The second approach uses a resident debug monitor program, which may be linked 
with an application or reside in on-chip read only memory (ROM). Debug interrupts may be 
triggered by internal or external events, and the monitor program then interacts with an 
external debugger to provide access to internal resources using the instruction set of the 
processor. 

It is important to note that the use of scan chains is a hardware intensive approach 
which relies on supporting hardware external to the core processor to be available for testing 
and debug. In a system-on-chip (SOC) environment where processing cores from one 
company are mixed with other hardware functions, such as peripheral interfaces possibly 
from other companies, requiring specialized external hardware support for debug and 
development reasons is a difficult approach. In the second approach described above, 
requiring the supporting debug monitor program to be resident with an application or in an 
on-chip ROM is also not desirable due to the reduction in the application program space. 

Thus, it is recognized that it will be highly advantageous to have a multiple-PE 
synchronized interrupt control and a dynamic debug monitor mechanism provided in a 
scalable processor family of embedded cores based on a single architecture model that uses 
common tools to support software configurable processor designs optimized for 
performance, power, and price across multiple types of applications using standard 
application specific integrated circuit (ASIC) processes as discussed further below. 
Summary of the Invention 

In one aspect of the present invention, a manifold array (ManArray) architecture is 
adapted to employ the present invention to solve the problem of maintaining the sequential 
program execution model with interrupts in a highly parallel scalable pipelined array 
processor containing multiple processing elements and distributed memories and register 
files. In this aspect, PE exception interrupts are supported and low latency interrupt 
processing is provided for embedded systems where real time signal processing is required. 
In addition, the interrupt apparatus proposed here provides debug monitor functions that 
allow for a debug operation without a debug monitor program being loaded along with or 
prior to loading application code. This approach provides a dynamic debug monitor, in 
which the debug monitor code is dynamically loaded into the processor and executed on any 
debug event that stops the processor, such as a breakpoint or "stop" command. The debug 
monitor code is unloaded when processing resumes. This approach may also advantageously 
include a static debug monitor as a subset of its operation and it also provides some of the 
benefits of fully external debug control which is found in the scan chain approach. 

Various further aspects of the present invention include effective techniques for 
synchronized interrupt control in the multiple PE environment, interruptible pipelined 2-cycle 
instructions, and condition forwarding techniques allowing interrupts between instructions. 
Further, techniques for address interrupts which provide a range of addresses on a master 
control bus (MCB) to which mailbox data may be written, with each address able to cause a 
different maskable interrupt, are provided. Further, special fetch control is provided for 
addresses in an interrupt vector table (IVT) which allows fetch to occur from within the 
memory at the specified address, or from a general coprocessor instruction port, such as the 
debug instruction register (DBIR) at interrupt vector 1 of the Manta implementation of the 
ManArray architecture, by way of example. 

These and other advantages of the present invention will be apparent from the 
drawings and the Detailed Description which follow. 

Brief Description of the Drawings 

Fig. 1 illustrates a ManArray 2x2 iVLIW processor which can suitably be employed 
with this invention; 

Fig. 2A illustrates an exemplary encoding and syntax/operation table for a SYSCALL 
instruction in accordance with the present invention; 

Fig. 2B illustrates a four mode interrupt transition state diagram; 

Fig. 3 illustrates external and internal interrupt requests to and output from a system 
interrupt select unit in accordance with the present invention; 

Fig. 4 illustrates how a single general purpose interrupt (GPI) bit of an interrupt 
request register (IRR) is generated in accordance with the present invention; 

Fig. 5 illustrates how a non-maskable interrupt bit in the IRR is generated from an OR 
of its sources; 

Fig. 6 illustrates how a debug interrupt bit in the IRR is generated from an OR of its 
sources; 

Fig. 7 illustrates an exemplary interrupt vector table (IVT) which may suitably reside 
in instruction memory; 

Fig. 8 illustrates a SYSCALL instruction vector mapping in accordance with the 
present invention; 

Fig. 9 illustrates the registers involved in interrupt processing; 

Fig. 10A illustrates a sliding interrupt processing pipeline diagram; 

Fig. 10B illustrates interrupt forwarding registers used in the SP and all PEs; 

Fig. 10C illustrates pipeline flow when an interrupt occurs and the saving of flag 
information in saved status registers (SSRs); 

Fig. 10D illustrates pipeline flow for single cycle short instruction words when a user 
mode program is preempted by a GPI; 

Fig. 11 illustrates a CE3c encoding description for 3-bit conditional execution; 

Fig. 12 illustrates a CE2b encoding description for 2-bit conditional execution; 

Fig. 13 illustrates a status and control register 0 (SCR0) bit placement; 

Fig. 14A illustrates a SetCC register 5-bit encoding description for conditional 
execution and PE exception interrupts; 

Fig. 14B illustrates a SetCC register 5-bit encoding description for conditional 
execution and PE exception interrupts; 

Fig. 15 illustrates an alternative implementation for a PE exception interface to the 
SP; 

Fig. 16 illustrates an alternative implementation for PE address generation for a PE 
exception interface to the SP; 

Fig. 17 illustrates aspects of an interrupt vector table for use in conjunction with the 
present invention; 

Fig. 18 illustrates aspects of the utilization of a debug instruction register (DBIR); 

Fig. 19 illustrates aspects of the utilization of a DSP control register (DSPCTL); 

Fig. 20 illustrates aspects of the utilization of a debug status register (DBSTAT); 

Figs. 21 and 22 illustrate aspects of the utilization of a debug-data-out (DBDOUT) and 
debug-data-in (DBDIN) register, respectively; and 

Fig. 23 illustrates aspects of an exemplary DSP ManArray residing on an MCB and 
MDB. 

Detailed Description 

Further details of a presently preferred ManArray core, architecture, and instructions 
for use in conjunction with the present invention are found in U.S. Patent Application Serial 
No. 08/885,310 filed June 30, 1997, now U.S. Patent No. 6,023,753, U.S. Patent Application 
Serial No. 08/949,122 filed October 10, 1997, U.S. Patent Application Serial No. 09/169,255 
filed October 9, 1998, U.S. Patent Application Serial No. 09/169,256 filed October 9, 1998, 
U.S. Patent Application Serial No. 09/169,072 filed October 9, 1998, U.S. Patent Application 
Serial No. 09/187,539 filed November 6, 1998, U.S. Patent Application Serial No. 
09/205,558 filed December 4, 1998, U.S. Patent Application Serial No. 09/215,081 filed 
December 18, 1998, U.S. Patent Application Serial No. 09/228,374 filed January 12, 1999 
and entitled "Methods and Apparatus to Dynamically Reconfigure the Instruction Pipeline of 
an Indirect Very Long Instruction Word Scalable Processor", U.S. Patent Application Serial 
No. 09/238,446 filed January 28, 1999, U.S. Patent Application Serial No. 09/267,570 filed 
March 12, 1999, U.S. Patent Application Serial No. 09/337,839 filed June 22, 1999, U.S. 
Patent Application Serial No. 09/350,191 filed July 9, 1999, U.S. Patent Application Serial 
No. 09/422,015 filed October 21, 1999 entitled "Methods and Apparatus for Abbreviated 
Instruction and Configurable Processor Architecture", U.S. Patent Application Serial No. 
09/432,705 filed November 2, 1999 entitled "Methods and Apparatus for Improved Motion 
Estimation for Video Encoding", U.S. Patent Application Serial No. 09/471,217 filed 
December 23, 1999 entitled "Methods and Apparatus for Providing Data Transfer Control", 
U.S. Patent Application Serial No. 09/472,372 filed December 23, 1999 entitled "Methods 
and Apparatus for Providing Direct Memory Access Control", U.S. Patent Application Serial 
No. 09/596,103 entitled "Methods and Apparatus for Data Dependent Address Operations 
and Efficient Variable Length Code Decoding in a VLIW Processor" filed June 16, 2000, 
U.S. Patent Application Serial No. 09/598,567 entitled "Methods and Apparatus for 
Improved Efficiency in Pipeline Simulation and Emulation" filed June 21, 2000, U.S. Patent 
Application Serial No. 09/598,564 entitled "Methods and Apparatus for Initiating and 
Resynchronizing Multi-Cycle SIMD Instructions" filed June 21, 2000, U.S. Patent 
Application Serial No. 09/598,566 entitled "Methods and Apparatus for Generalized Event 
Detection and Action Specification in a Processor" filed June 21, 2000, and U.S. Patent 
Application Serial No. 09/598,084 entitled "Methods and Apparatus for Establishing Port 
Priority Functions in a VLIW Processor" filed June 21, 2000, U.S. Patent Application Serial 
No. 09/599,980 entitled "Methods and Apparatus for Parallel Processing Utilizing a Manifold 
Array (ManArray) Architecture and Instruction Syntax" filed June 22, 2000, U.S. Patent 
Application Serial No. entitled "Methods and Apparatus for Providing 
Bit-Reversal and Multicast Functions Utilizing DMA Controller" filed February 23, 2001, U.S. 
Patent Application Serial No. entitled "Methods and Apparatus for 
Flexible Strength Coprocessing Interface" filed February 23, 2001, as well as Provisional 
Application Serial No. 60/113,637 entitled "Methods and Apparatus for Providing Direct 
Memory Access (DMA) Engine" filed December 23, 1998, Provisional Application Serial 
No. 60/113,555 entitled "Methods and Apparatus Providing Transfer Control" filed 
December 23, 1998, Provisional Application Serial No. 60/139,946 entitled "Methods and 
Apparatus for Data Dependent Address Operations and Efficient Variable Length Code 
Decoding in a VLIW Processor" filed June 18, 1999, Provisional Application Serial No. 
60/140,245 entitled "Methods and Apparatus for Generalized Event Detection and Action 
Specification in a Processor" filed June 21, 1999, Provisional Application Serial No. 
60/140,163 entitled "Methods and Apparatus for Improved Efficiency in Pipeline Simulation 
and Emulation" filed June 21, 1999, Provisional Application Serial No. 60/140,162 entitled 
"Methods and Apparatus for Initiating and Re-Synchronizing Multi-Cycle SIMD 
Instructions" filed June 21, 1999, Provisional Application Serial No. 60/140,244 entitled 
"Methods and Apparatus for Providing One-By-One Manifold Array (1x1 ManArray) 
Program Context Control" filed June 21, 1999, Provisional Application Serial No. 
60/140,325 entitled "Methods and Apparatus for Establishing Port Priority Function in a 
VLIW Processor" filed June 21, 1999, Provisional Application Serial No. 60/140,425 entitled 
"Methods and Apparatus for Parallel Processing Utilizing a Manifold Array (ManArray) 
Architecture and Instruction Syntax" filed June 22, 1999, Provisional Application Serial No. 
60/165,337 entitled "Efficient Cosine Transform Implementations on the ManArray 
Architecture" filed November 12, 1999, and Provisional Application Serial No. 60/171,911 
entitled "Methods and Apparatus for DMA Loading of Very Long Instruction Word 
Memory" filed December 23, 1999, Provisional Application Serial No. 60/184,668 entitled 
"Methods and Apparatus for Providing Bit-Reversal and Multicast Functions Utilizing DMA 
Controller" filed February 24, 2000, Provisional Application Serial No. 60/184,529 entitled 
"Methods and Apparatus for Scalable Array Processor Interrupt Detection and Response" 
filed February 24, 2000, Provisional Application Serial No. 60/184,560 entitled "Methods 
and Apparatus for Flexible Strength Coprocessing Interface" filed February 24, 2000, 
Provisional Application Serial No. 60/203,629 entitled "Methods and Apparatus for Power 
Control in a Scalable Array of Processor Elements" filed May 12, 2000, Provisional 
Application Serial No. 60/241,940 entitled "Methods and Apparatus for Efficient Vocoder 
Implementations" filed October 20, 2000, and Provisional Application Serial No. 60/251,072 
entitled "Methods and Apparatus for Providing Improved Physical Designs and Routing with 
Reduced Capacitive Power Dissipation" filed December 4, 2000, all of which are assigned to 
the assignee of the present invention and incorporated by reference herein in their entirety. 

In a presently preferred embodiment of the present invention, a ManArray 2x2 
iVLIW single instruction multiple data stream (SIMD) processor 100 as shown in Fig. 1 may 
be adapted as described further below for use in conjunction with the present invention. 
Processor 100 comprises a sequence processor (SP) controller combined with a processing 
element-0 (PE0) to form an SP/PE0 combined unit 101, as described in further detail in U.S. 
Patent Application Serial No. 09/169,072 entitled "Methods and Apparatus for Dynamically 
Merging an Array Controller with an Array Processing Element". Three additional PEs 151, 
153, and 155 are also utilized to demonstrate the scalable array processor interrupt detection 
and response mechanism. It is noted that the PEs can also be labeled with 
their matrix positions as shown in parentheses for PE0 (PE00) 101, PE1 (PE01) 151, PE2 
(PE10) 153, and PE3 (PE11) 155. The SP/PE0 101 contains an instruction fetch (I-fetch) 
controller 103 to allow the fetching of short instruction words (SIW) or 
abbreviated-instruction words from a B-bit instruction memory 105, where B is determined by the 
application instruction-abbreviation process to be a reduced number of bits representing 
ManArray native instructions and/or to contain two or more abbreviated instructions as 
further described in U.S. Patent Application Serial No. 09/422,015 filed October 21, 1999 
and incorporated by reference herein in its entirety. If an instruction abbreviation apparatus 
is not used then B is determined by the SIW format. The fetch controller 103 provides the 
typical functions needed in a programmable processor, such as a program counter (PC), a 
branch capability, eventpoint loop operations (see U.S. Provisional Application Serial No. 
60/140,245 entitled "Methods and Apparatus for Generalized Event Detection and Action 
Specification in a Processor" filed June 21, 1999 for further details), and support for 
interrupts. It also provides the instruction memory control which could include an instruction 
cache if needed by an application. In addition, the I-fetch controller 103 dispatches 
instruction words and instruction control information to the other PEs in the system by means 
of a D-bit instruction bus 102. D is determined by the implementation; for the 
exemplary ManArray coprocessor, D = 32 bits. The instruction bus 102 may include 
additional control signals as needed in an abbreviated-instruction translation apparatus. 

In this exemplary system 100, common elements are used throughout to simplify the 
explanation, though actual implementations are not limited to this restriction. For example, 
the execution units 131 in the combined SP/PE0 101 can be separated into a set of execution 
units optimized for the control function, for example, fixed point execution units in the SP, 
and the PE0 as well as the other PEs can be optimized for a floating point application. For 
the purposes of this description, it is assumed that the execution units 131 are of the same 
type in the SP/PE0 and the PEs. In a similar manner, SP/PE0 and the other PEs use a five 
instruction slot iVLIW architecture which contains a VLIW memory (VIM) 109 and an 
instruction decode and VIM controller functional unit 107 which receives instructions as 
dispatched from the SP/PE0's I-fetch unit 103 and generates VIM addresses and control 
signals 108 required to access the iVLIWs stored in the VIM. Referenced instruction types 
are identified by the letters SLAMD in VIM 109, where the letters are matched up with 
instruction types as follows: Store (S), Load (L), Arithmetic Logic Unit or ALU (A), 
Multiply Accumulate Unit or MAU (M), and Data Select Unit or DSU (D). 

The basic concept of loading the iVLIWs is described in more detail in U.S. Patent 
Application Serial No. 09/187,539 entitled "Methods and Apparatus for Efficient 
Synchronous MIMD Operations with iVLIW PE-to-PE Communication". Also contained in 
the SP/PE0 and the other PEs is a common PE configurable register file 127 which is 
described in further detail in U.S. Patent Application Serial No. 09/169,255 entitled "Method 
and Apparatus for Dynamic Instruction Controlled Reconfiguration Register File with 
Extended Precision". Due to the combined nature of the SP/PE0, the data memory interface 
controller 125 must handle the data processing needs of both the SP controller, with SP data 
in memory 121, and PE0, with PE0 data in memory 123. The SP/PE0 controller 125 also is 
the controlling point of the data that is sent over the 32-bit or 64-bit broadcast data bus 126. 
The other PEs, 151, 153, and 155, contain common physical data memory units 123', 123'', 
and 123''' though the data stored in them is generally different as required by the local 
processing done on each PE. The interface to these PE data memories is also a common 
design in PEs 1, 2, and 3 and indicated by PE local memory and data bus interface logic 157, 
157' and 157''. Interconnecting the PEs for data transfer communications is the cluster 
switch 171, various aspects of which are described in greater detail in U.S. Patent Application 
Serial No. 08/885,310 entitled "Manifold Array Processor", and U.S. Patent Application 
Serial No. 09/169,256 entitled "Methods and Apparatus for Manifold Array Processing", and 
U.S. Patent Application Serial No. 09/169,256 entitled "Methods and Apparatus for 
ManArray PE-to-PE Switch Control". The interface to a host processor, other peripheral 
devices, and/or external memory can be done in many ways. For completeness, a primary 
interface mechanism is contained in a direct memory access (DMA) control unit 181 that 
provides a scalable ManArray data bus (MDB) 183 that connects to devices and interface 
units external to the ManArray core. The DMA control unit 181 provides the data flow and 
bus arbitration mechanisms needed for these external devices to interface to the ManArray 
core memories via the multiplexed bus interface represented by line 185. A high level view 

of a ManArray control bus (MCB) 191 is also shown in Fig. 1. The ManArray architecture 
uses two primary bus interfaces: the ManArray data bus (MDB), and the ManArray control 
bus (MCB). The MDB provides for high volume data flow in and out of the DSP array. The 
MCB provides a path for peripheral access and control. The width of either bus varies 
between different implementations of ManArray processor cores. The width of the MDB is 
set according to the data bandwidth requirements of the array in a given application, as well 
as the overall complexity of the on-chip system. Further details of presently preferred DMA 
control and coprocessing interface techniques are found in U.S. Application Serial No. 
and Provisional Application Serial No. 60/184,668 both of which are 
entitled "Methods and Apparatus for Providing Bit-Reversal and Multicast Functions 
Utilizing DMA Controller" and which were filed February 23, 2001 and February 24, 2000, 
respectively, and U.S. Application Serial No. and Provisional Application Serial 
No. 60/184,560 both entitled "Methods and Apparatus for Flexible Strength Coprocessing 
Interface" filed February 23, 2001 and February 24, 2000, respectively, all of which are 
incorporated by reference in their entirety herein. 
Interrupt Processing 

Up to 32 interrupts, including general purpose interrupts (GPI-04 through GPI-31), 
non-maskable interrupts (NMI), and others, are recognized, prioritized, and processed in this 
exemplary ManArray scalable array processor in accordance with the present invention as 
described further below. To begin with, a processor interrupt is an event which causes the 
preemption of the currently executing program in order to initiate special program actions. 
Processing an interrupt generally involves the following steps: 

Save the minimum context of the currently executing program, 
Save the current instruction address (or program counter), 
Determine the interrupt service routine (ISR) start address and branch to it, 
Execute the interrupt program code until a "return from interrupt" instruction is 
decoded, 
Restore the interrupted program's context, and 
Restore the program counter and resume the interrupted program. 

Interrupts are specified in three primary ways: a classification of the interrupt signals into 
three levels, whether they are asynchronous versus synchronous, and maskable versus 
non-maskable. Interrupt level is a classification of interrupt signals where the classification is by 
rank or degree of importance. In an exemplary ManArray system, there are three levels of 
interrupts where 1 is the lowest and 3 the highest. These ManArray interrupt levels are: 
interrupt level 1 is for GPI and SYSCALL; interrupt level 2 is for NMI; and interrupt level 3 
is for Debug. SYSCALL is an instruction which causes the address of the instruction 
immediately following SYSCALL to be saved in a general-purpose interrupt link register 
(GPILR) and the PC to be loaded with the specified vector from the system vector table. The 
system vector table contains 32 vectors numbered from 0 to 31. Each vector contains a 
32-bit address used as the target of a SYSCALL. Fig. 2A shows an exemplary encoding 202 and 
a syntax/operation table 204 for a presently preferred SYSCALL instruction. 
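
By way of illustration only, the SYSCALL behavior described above (link address saved in the GPILR, PC loaded from the selected vector) can be sketched as a small software model. The class name, field names, and the 4-byte instruction step are assumptions made for this sketch and are not taken from the ManArray specification.

```python
from dataclasses import dataclass, field

SIW_BYTES = 4  # assumption: one 32-bit short instruction word per address step


@dataclass
class SequenceProcessorModel:
    """Toy model of the SP state touched by SYSCALL (all names are illustrative)."""
    pc: int = 0
    gpilr: int = 0  # general-purpose interrupt link register
    vectors: list = field(default_factory=lambda: [0] * 32)  # system vector table

    def syscall(self, vector: int) -> None:
        if not 0 <= vector < 32:
            raise ValueError("SYSCALL vector must be in 0..31")
        # Save the address of the instruction immediately following SYSCALL.
        self.gpilr = self.pc + SIW_BYTES
        # Load the PC with the 32-bit target address held in the chosen vector.
        self.pc = self.vectors[vector]

    def return_from_interrupt(self) -> None:
        # Resume the interrupted program at the saved link address.
        self.pc = self.gpilr


sp = SequenceProcessorModel(pc=0x1000)
sp.vectors[5] = 0x2400  # hypothetical handler address
sp.syscall(5)
```

The model captures only the PC and link register; context save and restore, which the steps above also require, are omitted.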

By design choice, interrupts at one classification level cannot preempt interrupts at 
the same level or interrupts at a higher level, unless this rule is specifically overridden by 
software, but may preempt interrupts at a lower level. This condition creates a hierarchical 
interrupt structure. Synchronous interrupts occur as a result of instruction execution while 
asynchronous interrupts occur as a result of events external to the instruction processing 
pipeline. Maskable interrupts are those which may be enabled or disabled by software, while 
non-maskable interrupts, once enabled, may not be disabled by software. 
Interrupt enable/disable bits control whether an interrupt is serviced or not. An interrupt can 
become pending even if it is disabled. 
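
The hierarchical rule and the distinction between pending and enabled interrupts can be illustrated with a minimal sketch. The level numbers follow the text; treating user-mode code as level 0, and the function and variable names, are assumptions made for illustration.

```python
# Interrupt levels as given in the text: 1 is lowest, 3 is highest.
LEVEL = {"GPI": 1, "SYSCALL": 1, "NMI": 2, "DEBUG": 3}
USER_LEVEL = 0  # assumption: user-mode code is modeled as level 0


def may_preempt(incoming: str, running_level: int) -> bool:
    """Default rule: only a strictly higher level may preempt the running level."""
    return LEVEL[incoming] > running_level


def serviceable(pending_bits: int, enable_bits: int) -> int:
    """An interrupt may be pending while disabled; only enabled ones are serviced."""
    return pending_bits & enable_bits
```

For example, `may_preempt("NMI", LEVEL["GPI"])` is true while `may_preempt("GPI", LEVEL["GPI"])` is false, and a pending bit whose enable bit is clear remains pending without being serviced.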

Interrupt hardware provides for the following: 
Interrupt sources and source selection, 
Interrupt control (enable/disable), 

Interrupt mapping: source event-to-ISR, and 
Hardware support for context save/restore. 
These items are discussed further below. 
Interrupt Modes and Priorities 

In ManArray processors, there are four interrupt modes of operation, not including 
low power modes, and three levels of interrupts which cause the processor to switch between 
modes. The modes shown in the four mode interrupt transition state diagram 200 of Fig. 2B 
are: a user mode 205, a system mode 210, an NMI mode 215, and a debug mode 220. User 
mode is the normal mode of operation for an application program; system mode is the mode 
of operation associated with handling a first level type of interrupt, such as a GPI or 
SYSCALL; NMI mode is the mode of operation associated with the handling of a 
non-maskable interrupt, for example the processing state associated with a loss of power interrupt; 
and debug mode is the mode of operation associated with the handling of a debug interrupt, 
such as single step and break points. 

A processor mode of operation is characterized by the type of interrupts that can, by 
default, preempt it and the hardware support for context saving and restoration. In an 
exemplary ManArray core, there are up to 28 GPI level interrupts that may be pending, 
GPI-04 through GPI-31, with GPI-04 having highest priority and GPI-31 lowest when more than 
one GPI is asserted simultaneously. State diagram 200 of Fig. 2B illustrates the processor 
modes and how interrupts of each level cause mode transitions. The interrupt hardware 
automatically masks interrupts (disables interrupt service) at the same or lower level once an 
interrupt is accepted for processing (acknowledged). The software may reenable a pending 
interrupt, but this should be done only after copying to memory the registers which were 
saved by hardware when the interrupt being processed was acknowledged; otherwise they 
will be overwritten. The default rules are: 

GPI 233, SYSCALL 234, NMI 232 and debug interrupts 231 may preempt a user 
mode 205 program. SYSCALL 234 does this explicitly. 

NMI 237 and debug interrupts 236 may preempt a GPI program (ISR) running in 
system mode 210. 

Debug interrupts 238 may preempt an NMI mode 215 program (ISR). 

GPIs save status (PC and flags) and 2-cycle instruction data registers when 
acknowledged. SYSCALL 234 operates the same as a GPI 233 from the standpoint of saving 
state, and uses the same registers as the GPIs 233. 

Debug interrupts 231 save status and 2-cycle instruction data registers when they 
preempt user mode 205 programs, but save only status information when they preempt 
system mode ISRs 210 or NMI ISRs 215. The state saved during interrupt processing is 
discussed further below. 

NMI interrupts 237 save status but share the same hardware with system mode 210. 
Therefore, non-maskable interrupts are not fully recoverable to the pre-interrupt state, but the 
context in which they occur is saved. 
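
The default rules above amount to a small transition table over the four modes. The following sketch encodes them directly; the mode and event strings, and the function names, are illustrative labels rather than identifiers from the ManArray architecture, and software overrides of the default rules are not modeled.

```python
# Default mode transitions, keyed by (current mode, accepted interrupt).
TRANSITIONS = {
    ("user", "GPI"): "system",
    ("user", "SYSCALL"): "system",
    ("user", "NMI"): "nmi",
    ("user", "DEBUG"): "debug",
    ("system", "NMI"): "nmi",
    ("system", "DEBUG"): "debug",
    ("nmi", "DEBUG"): "debug",
}


def accept(mode: str, interrupt: str):
    """Return the new mode, or None if the interrupt cannot preempt by default."""
    return TRANSITIONS.get((mode, interrupt))


def debug_saved_state(preempted_mode: str) -> set:
    """State saved when a debug interrupt preempts the given mode, per the text."""
    if preempted_mode == "user":
        return {"status", "two_cycle_data"}
    return {"status"}  # system mode or NMI mode ISR: status only
```

An interrupt for which `accept` returns `None`, such as a GPI arriving in system mode, simply remains pending under the default rules.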
Interrupt Sources 

There are multiple sources of interrupts to a DSP core, such as the ManArray 
processor described herein. These sources may be divided into two basic types, synchronous 
and asynchronous. Synchronous interrupts are generated as a direct result of instruction 
execution within the DSP core. Asynchronous interrupts are generated as a result of other 
system events. Asynchronous interrupt sources may be further divided into external sources 
(those coming from outside the ManArray system core) and internal sources (those coming 
from devices within the system core). Up to 32 interrupt signals may be simultaneously 
asserted to the DSP core at any time, and each of these 32 may arise from multiple sources. 
A module called the system interrupt select unit (SISU) gathers all interrupt sources and, 
based on its configuration which is programmable in software, selects which of the possible 
32 interrupts may be sent to the DSP core. There is a central interrupt controller 320 shown 
in Fig. 3 called the interrupt control unit (ICU) within the DSP core. One task of the ICU is 
to arbitrate between the 32 pending interrupts which are held in an interrupt request register 
(IRR) within the ICU. The ICU arbitrates between pending interrupts in the IRR on each 
cycle. 

Synchronous Interrupt Sources 

One method of initiating an interrupt is by directly setting bits in the interrupt request
register (IRR) that is located in the DSP interrupt control unit (ICU) 320. This direct setting
may be done by load instructions or DSU COPY or BIT operations.

Another method of initiating an interrupt is by using a SYSCALL instruction. This
SYSCALL initiated interrupt is a synchronous interrupt which operates at the same level as
GPIs. SYSCALL is a control instruction which combines the features of a call instruction
with those of an interrupt. The argument to the SYSCALL instruction is a vector number.
This number refers to an entry in the SYSCALL table 800 of Fig. 8 which is located in SP
instruction memory starting at address 0x00000080 through address 0x000000FF containing
32 vectors. A SYSCALL is at the same level as a GPI and causes GPIs to be disabled via the
general purpose interrupt enable (GIE) bit in status and control register 0 (SCR0). It also
uses the same interrupt status and link registers as a GPI.
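
As an illustration only, the mapping from a SYSCALL vector number to its table entry can be sketched as follows. The table bounds and the vector count are taken from the text; the entry size of 4 bytes is an assumption derived by dividing the table span evenly among the 32 vectors, and the function name is hypothetical.

```python
SYSCALL_TABLE_START = 0x00000080  # first table address, from the text
SYSCALL_TABLE_END = 0x000000FF    # last table address, from the text
SYSCALL_VECTORS = 32              # vector count, from the text

# Assumption: the vectors are equally sized and packed back to back,
# which gives (0xFF - 0x80 + 1) / 32 = 4 bytes per entry.
ENTRY_BYTES = (SYSCALL_TABLE_END - SYSCALL_TABLE_START + 1) // SYSCALL_VECTORS


def syscall_vector_address(vector: int) -> int:
    """Return the instruction-memory address of a SYSCALL table entry."""
    if not 0 <= vector < SYSCALL_VECTORS:
        raise ValueError("SYSCALL vector number must be in 0..31")
    return SYSCALL_TABLE_START + vector * ENTRY_BYTES
```

Under these assumptions, vector 0 resolves to the first table address and vector 31 to the last 4-byte entry of the table.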

Asynchronous Interrupt Sources

Asynchronous interrupt sources are grouped under their respective interrupt levels,
Debug, NMI and GPI. The address interrupt described further below can generate any of
these three levels of interrupts.
Debug and Address Interrupts 

Debug interrupt resources include the debug control register, debug instruction
register and debug breakpoint registers. Examples of debug interrupts in the context of the
exemplary ManArray processor are for software break points and for single stepping the
processor.

Address interrupts are a mechanism for invoking any interrupt by writing to a
particular address on the MCB as listed in table 700 of Fig. 7. When a write is detected to an
address mapped to an address interrupt, the corresponding interrupt signal is asserted to the
DSP core interrupt control unit. There are four ranges of 32 byte addresses, each of which is
defined to generate address interrupts. A write to an address in a first range (Range 0) 720
causes the corresponding interrupt, a single pulse on the wire to the ICU. A write to a second
range (Range 1) 725 causes assertion of the corresponding interrupt signal and also writes the
data to a register "mailbox" (MBOX1). A write to further ranges (Ranges 2 and 3) 730 and
735, respectively, has the same effect as a write to Range 1, with data going to register
mailboxes 2 and 3, respectively. In another example, an address interrupt may be used to
generate an NMI to the DSP core by writing to one of the addresses associated with an NMI
row 740 and one of the columns 710. For further details, see the interrupt source/vector table
of Fig. 7 and its discussion below.
NMI 

The NMI may come from either an internal or external source. It may be invoked by 
either a signal or by an address interrupt. 
GPI Level Interrupts

The general purpose interrupts may suitably include, for example, DMA, timer, bus
errors, external interrupts, and address interrupts. There are four DMA interrupt signals
(wires), two from each DMA lane controller (LC). LCs are also capable of generating
address interrupts via the MCB.

A system timer is designed to provide a periodic interrupt source and an absolute time 
reference. 

When a bus master generates a target address which is not acknowledged by a slave
device, an interrupt may be generated.

External interrupts are signals which are inputs to the processor system core interface.

An address interrupt may be used to generate any GPI to the DSP core, in a similar
manner to that described above in connection with debug and address interrupts.
Interrupt Selection 

External and internal interrupt signals converge at a system interrupt select unit
(SISU) 310 shown in interrupt interface 300 of Fig. 3. Registers in this unit allow selection
and control of internal and external interrupt sources for sending to the DSP ICU. A single
register, the interrupt source control register (INTSRC), determines if a particular interrupt
vector will respond to an internal or external interrupt. Fig. 3 shows the interrupt sources
converging at the SISU 310 and the resulting set of 30 interrupt signals 330 sent to the
interrupt request register (IRR) in the DSP ICU 320.

Fig. 4 shows logic circuitry 400 to illustrate how a single GPI bit of the interrupt
request register (IRR) is generated. A core interrupt select register (CISRS) bit 412 selects
via multiplexer 410 between an external 415 or internal 420 interrupt source. An address
interrupt 425 enabled by an address interrupt enable register (AIER) 435 or a selected
interrupt source 430 generates the interrupt request 440. Fig. 5 shows logic circuitry 500
which illustrates how the NMI bit in the IRR is generated from its sources. Note that the
sources are ORed (510, 520) together rather than multiplexed, allowing any NMI event to pass
through unmasked. Fig. 6 shows logic circuitry 600 illustrating how the DBG bit in the IRR
is generated from its sources. Note again that the sources are ORed (610, 620) together
rather than multiplexed.
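
The request logic described for Figs. 4 and 5 can be approximated in software as a minimal sketch. The function and parameter names are hypothetical, and the polarity of the select bit (true selecting the external source) is an assumption; the figures themselves are not reproduced here.

```python
def gpi_irr_bit(select_external: bool, external: bool, internal: bool,
                address_interrupt: bool, address_enable: bool) -> bool:
    """Model one GPI bit of the IRR as described for Fig. 4."""
    # Multiplexer: the select bit chooses one of the two signal sources.
    # Assumption: True selects the external source.
    selected_source = external if select_external else internal
    # The address interrupt is gated by its enable bit, then ORed with
    # the selected source to form the interrupt request.
    return selected_source or (address_interrupt and address_enable)


def nmi_irr_bit(*sources: bool) -> bool:
    """Model the NMI bit of the IRR as described for Fig. 5.

    The sources are ORed rather than multiplexed, so any one of them
    sets the bit without masking.
    """
    return any(sources)
```

The contrast between the two functions reflects the point made above: a GPI source that is not selected cannot raise the request, whereas every NMI source always can.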

Mapping Interrupts to Interrupt Service Routines (ISRs) 

There are two mechanisms for mapping interrupt events to their associated ISRs. 
Asynchronous interrupts are mapped to interrupt handlers through an interrupt vector table
(IVT) 700 shown in Fig. 7 which also describes the assignment of interrupt sources to their
corresponding vectors in the interrupt vector table.
Software generated SYSCALL interrupts are mapped to interrupt handlers through a
SYSCALL vector table 800 shown in Fig. 8. The interrupt vector table 700 may
advantageously reside in a processor instruction memory from address 0x00000000 through
address 0x0000007F. It consists of 32 addresses, each of which contains the address of the
first instruction of an ISR corresponding to an interrupt source.

An example of operation in accordance with the present invention is discussed below.
Interrupt GPI-04 715 of Fig. 7 has an associated interrupt vector (address pointer) 04 at
address 0x00000010 in instruction memory which should be initialized to contain the address
of the first instruction of an ISR for GPI-04. This vector may be invoked by an external
interrupt source, if the external source is enabled in the INTSRC register. In the exemplary
ManArray processor, when GPI-04 is configured for an internal source, the interrupt may be
asserted by the DSP system timer. In addition, MCB data writes to addresses 0x00300204,
0x00300224, 0x00300244, and 0x00300264 will cause this interrupt to be asserted if their
respective ranges are enabled in the address interrupt enable register (ADIEN). Writes to the
last three addresses will additionally latch data in the corresponding "mailbox" register
MBOX1, MBOX2, or MBOX3 which can be used for interprocessor communication.
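
The addresses quoted in the GPI-04 example are consistent with a simple address calculation, sketched below. The 4-byte vector size follows from 32 vectors occupying 0x00000000 through 0x0000007F. The MCB base address 0x00300200 and the one-byte-address-per-interrupt layout within each 32-byte range are assumptions inferred from the four addresses given for GPI-04, not values stated in the text; the function names are hypothetical.

```python
IVT_BASE = 0x00000000        # start of the interrupt vector table, from the text
VECTOR_BYTES = 4             # 128 bytes / 32 vectors
ADDR_INT_BASE = 0x00300200   # assumption: inferred from 0x00300204 for GPI-04
RANGE_BYTES = 32             # each range holds 32 byte addresses, from the text


def ivt_entry_address(vector: int) -> int:
    """Instruction-memory address holding the ISR pointer for a vector."""
    if not 0 <= vector < 32:
        raise ValueError("vector number must be in 0..31")
    return IVT_BASE + vector * VECTOR_BYTES


def address_interrupt_address(range_index: int, vector: int) -> int:
    """MCB byte address that asserts a given interrupt in a given range.

    Assumption: byte offset n within a range corresponds to interrupt n.
    """
    if not 0 <= range_index < 4 or not 0 <= vector < 32:
        raise ValueError("range must be 0..3 and vector 0..31")
    return ADDR_INT_BASE + range_index * RANGE_BYTES + vector
```

With these assumptions, vector 04 resolves to 0x00000010 and the four range addresses reproduce the values listed in the example above.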

Fig. 8 shows SYSCALL vector mapping 800. ISRs which are invoked with 
SYSCALL have the same characteristics as GPI ISRs. 
Interrupt Control 

Registers involved with interrupt control are shown in register table 900 of Fig. 9.

Further details of the presently preferred interrupt source control register and the
address interrupt enable register are shown in the tables below.






INTSRC Interrupt Source Configuration Register Table 
Reset value: 0x00000000 
R      Reserved
EXTxx  0 = Internal source
       1 = External source



ADIEN Address Interrupt Enable Register 
Reset value: 0x00000000 
AIRx   Enable Address Interrupt Range 'x'

       0 = Address Interrupt for range 'x' disabled

       1 = Address Interrupt for range 'x' enabled



Address interrupts are triggered by writes to specific addresses (mapped to the ManArray 
Control Bus). Each range contains 32 (byte) addresses. When a range's AIR bit is set, a 
write to a particular address in the range causes the corresponding interrupt to be asserted to 
the DSP core. 
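
A small behavioral model may help to visualize the interaction between ADIEN and the mailbox registers. It is a sketch under several assumptions that the text does not confirm: that the enable bit for range x is bit x of ADIEN, that byte offset n within a range corresponds to interrupt n, and that a write to a disabled range has no effect at all. The class and method names are hypothetical.

```python
from typing import Dict, Optional


class AddressInterruptModel:
    """Behavioral sketch of address interrupt ranges 0 to 3."""

    def __init__(self) -> None:
        self.adien = 0x00000000  # reset value given in the text
        # Ranges 1 to 3 each have a mailbox register; Range 0 has none.
        self.mbox: Dict[int, Optional[int]] = {1: None, 2: None, 3: None}

    def enable_range(self, range_index: int) -> None:
        # Assumption: the enable bit for range x is bit x of ADIEN.
        self.adien |= 1 << range_index

    def mcb_write(self, range_index: int, offset: int, data: int) -> Optional[int]:
        """Return the interrupt number asserted by the write, or None."""
        if not 0 <= range_index <= 3 or not 0 <= offset < 32:
            raise ValueError("range must be 0..3 and offset 0..31")
        if not (self.adien >> range_index) & 1:
            return None  # assumption: a disabled range ignores the write
        if range_index >= 1:
            self.mbox[range_index] = data  # latch data in MBOX1..MBOX3
        return offset  # assumption: offset n asserts interrupt n
```

In this model, a write to Range 0 only pulses the interrupt, while a write to Range 1, 2 or 3 also leaves the written data in the corresponding mailbox.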

Interrupt Processing Specifics

Interrupt processing involves the following steps: 

1. Interrupt detection, 

2. Interrupt arbitration, 

3. Save essential program state (PC, flags, 2-cycle target data),

4. Fetch IVT vector into PC,

5. Execute ISR,

6. Execute RETI,

7. Restore essential program state, and 

8. Restore PC from appropriate interrupt link register. 

Some specific points of the exemplary ManArray processor implementation are: 
When multiple interrupts are pending, their service order is as follows: Debug, NMI,
GPI-04, GPI-05, and so on.

A SYSCALL instruction, if in decode, will execute as if it were of higher priority
than any GPI. If there is an NMI or Debug interrupt pending, then the SYSCALL ISR will
be preempted after the first instruction is admitted to the pipe (only one instruction of the ISR
will execute).

One instruction is allowed to execute at any level before the next interrupt is allowed
to preempt. This constraint means that if an RETI is executed at the end of a GPI ISR and
another GPI is pending, then exactly one instruction of the USER level program will execute
before the next GPI's ISR is fetched.

The Debug interrupt saves PC, flags and interrupt forwarding registers (IFRs) when it 
is accepted for processing (acknowledged) while in User mode. If it is acknowledged while 
in GPI mode or NMI mode, it will only save PC and flags as it uses the same IFRs as the GPI 
level. 

If processing a Debug interrupt ISR, and the Debug IRR bit is set, then an RETI will
result in exactly one instruction executing before returning to the Debug ISR.

Load VLIW (LV) instructions are not interruptible and therefore are considered one
(multi-cycle) instruction. Further details of LV instructions are provided in U.S. Patent No.
6,151,668 which is incorporated by reference herein in its entirety.
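
The service order given above for simultaneously pending interrupts can be sketched as a fixed-priority selection. This is an illustrative model only: the function name and the returned labels are hypothetical, and ordering the GPIs by ascending vector number is an assumption drawn from the sequence "GPI-04, GPI-05, and so on".

```python
from typing import Iterable, Optional


def next_interrupt(debug_pending: bool, nmi_pending: bool,
                   gpi_pending: Iterable[int]) -> Optional[str]:
    """Pick the pending interrupt to service next, or None if idle."""
    if debug_pending:
        return "DEBUG"  # highest priority
    if nmi_pending:
        return "NMI"
    # Assumption: among GPIs, the lowest vector number is served first.
    pending = sorted(gpi_pending)
    if pending:
        return f"GPI-{pending[0]:02d}"
    return None
```

The sketch does not model the rule that one instruction executes at a level before the next preemption; it covers only the choice among pending requests.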
Interrupt Pipeline Diagrams

Fig. 10A depicts an interrupt pipeline diagram 1000 that can be used to depict the 
events that happen in an instruction flow when an interrupt occurs. To use the diagram for 
this purpose, follow these directions: 

1. Cut Fig. 10A along dashed line 1002, and

2. Slide "instruction stream" I0-I7 1030 under execution units fetch (F), decode
(DEC), execute 1 (EX1), condition return 1/execute 2 (CR1/EX2) and condition return 2
(CR2) 1032 to observe flag generation and condition feedback visually.

Fig. 10B illustrates a system 1050 with interrupt forwarding registers used in an SP and
all PEs with functional units, load unit (LU) 1052, store unit (SU) 1054, DSU 1056,
ALU 1058, MAU 1060 and condition generation unit (CGU) 1062. Configurable register file,
also known as compute register file (CRF) 1064, is also shown. Fig. 10C shows a flag
table 1080 illustrating saved flag information within the saved status registers (SSRs).
Fig. 10A is based upon the following assumptions: 

1. Only current flags 1026 and hot conditions 1034 from condition return 1
(CR1) 1004 and hot conditions 1036 from CR2 1006 affect conditional execution. Hot
conditions are the condition information generated in the last stage of an execution unit's
operation and are available in the condition return stage of the pipeline prior to their being
latched at the end of the condition return stage. The net result of condition generation unit
(CGU) 1062 condition reduction is labeled "Condex flags" (1038).

2. Execution unit updates (EX Flag Updates) 1040 do not affect conditional
execution until the instruction which generates them reaches CR1 phase.

3. Interrupt acknowledge occurs between I3 1008 and I4 1010. On RETI, the
state of the pipe must be restored so that it appears to I4 as if no interrupt had occurred.

4. Each execution unit supplies hot condition flags and pipe phase information.
The CGU 1062 must decode this information into a set of flags from each phase or "no flags"
if a phase does not have an instruction which updates flags. Using this information, it can
supply the correct "Condex flags" 1038 to the DEC and EX1 in stages 1012 and 1014, and
update the latched flags 1042 correctly.

5. Note that the muxes 1016, 1018 and 1020 represent the logical "selection"
between flag information from each phase.

Referring to Fig. 10A and sliding the instructions I0-I7 1030 right along the execution
units 1032, interrupt processing proceeds as follows:

1. When instruction 3 (I3) 1008 is in DEC 1012: The interrupt is acknowledged.
The fetch program counter (PC) which contains the address of I4 1010 is saved to the correct
interrupt link register (ILR).

2. When I3 is in execute 1 (EX1) pipeline stage 1014: Update all flags according
to I1 1022, I2 1023 and I3 1008 normally. Save the Condex flags. These are the "hot" flags
which are to be supplied to I4 1010 when it is in decode.

3. When I3 1008 is in CR1 1004: Save the status and control register (SCR0)
since this might be read by I4 in EX1 and it might have been updated by I3 in EX1. Update
Condex flags based on I2 and I3, and save the Condex flags. These will be fed back to I4
1010 and I5 1024 and provided as input to flag update mux 1016 (selecting between Condex
flags and EX Flag Updates). If I3 contains a 2-cycle instruction, execution unit result data
must be saved to an interrupt forwarding register (IFR). Both ALU 1058 and MAU 1060
require 64-bit IFRs to save this data.

4. When I3 is in CR2: Since I3 might be a 2-cycle instruction, save CR2 flags
(shown in figure). These flags will be fed into the CR1/CR2 flag select mux 1020 when I4
reaches CR1. All other select inputs will by then be supplied by new instructions I4 and I5.

On the return from interrupt (RETI), the following events occur: 

1. Restore ILR to fetch PC and fetch I4.

2. I4 in DEC: Supply Condex flags that were saved in Step 2 above. These flags
will be used for conditional execution. Restore saved SCR0 (from Step 3) since this SCR0 is
read by I4.

3. I4 in EX1: Supply Condex flags that were saved in Step 3 above for I4 and I5
conditional execution. Condex flags are also supplied to EX/Condex Flag select mux 1016.
Since I4 provides flag information to the CGU, the CGU determines the proper update based
on the saved Condex flag information and new I4 EX flag update information. If 2-cycle
data from I3 was saved, supply this to the write-back path of CRF 1064 via multiplexers
1065 and 1066. This will update the CRF 1064 unless I4 contains 1-cycle instructions in the
same unit(s) that I3 used for 2-cycle instructions.

4. I4 in CR1: Supply CR2 flags to CR1/CR2 mux 1020, with all other mux
controls provided normally by CGU based on inputs from instructions (I4 and I5) in earlier
stages.

5. Done, instruction processing continues normally. 

The hardware provides interrupt forwarding registers 1070-1076, as illustrated in the
system 1050 of Fig. 10B, in the SP and all PEs that are used as follows:

(1) When an interrupt occurs and is acknowledged, all instructions in the decode
phase are allowed to proceed through execute. One-cycle instructions are allowed to
complete and update their target registers and flags. Any two-cycle instructions are allowed
to complete also, but their output, which includes result data, result operand register
addresses and flag information, is saved in a set of special purpose registers termed the
"interrupt forwarding registers" (IFRs) 1070-1076 as shown in Fig. 10B, and no update is
made to the register file (CRF) 1064 or status registers.

Uniquely, when an interrupt occurs, interface signals are provided to all PEs to
support the following operations independently in each PE dependent upon the local PE
instruction sequence prior to the interrupt. For example, there can be a different mixture of
1-cycle and 2-cycle instructions in each PE at the time of an interrupt and by using this signal
interface and local information in each PE the proper operation will occur in each PE on the
return from interrupt, providing synchronized interrupt control in the multiple PE
environment. These interface signals include save/restore signals, interrupt type, and
extended or normal pipe status. Specifically, these interface signals are:
Save SSR state machine state (SP_VCU_s_ssr_state[1:0])

These two bits indicate the state of an internal Save Saved Status Register (SSR) state
machine. The signals represent 4 possible states (IDLE, I4_EX, I5_EX, I6_EX). When not
in the idle state, the Save SSR state machine indicates the phase of the pipe that the
interrupted instruction would be in had an interrupt not occurred. If you consider a sequence
of 6 instructions (I1, I2, ..., I6), and the fourth instruction is interrupted, the listed state
machine labels indicate when the 4th, 5th and 6th instructions would have been in the execute
phase of the pipeline. This machine state information is used locally in each PE as one of the
indicators for when the IFRs need to be saved and what state needs to be saved to SSRs.
Restore SSR state machine state (SP_VCU_r_ssr_state[1:0])

These bits indicate the state of an internal Restore SSR state machine. The signals
represent 4 possible states (IDLE, I4_DC, I5_DC, I6_DC). When not in the idle state, the
Restore SSR state machine indicates the phase of the pipe that the interrupted instruction is in
after it is fetched and put into the pipe again (i.e., from a return from interrupt). If you
consider a sequence of 6 instructions (I1, I2, ..., I6), and the fourth instruction is interrupted,
the state machine labels indicate when the 4th, 5th and 6th instructions are in the decode phase
of the pipeline. This machine state information is used locally in each PE as one of the
indicators for when the IFRs need to be restored and when state needs to be restored from the
SSRs.

Save SSRs (SP_VCU_save_ssr)

This bit indicates when the SSRs must be saved.
Transfer System SSRs to User SSRs (SP_VCU_xfer_ssr)

This signal indicates the System SSRs must be transferred to the User SSRs.
Select User SSRs (VCU_sel_gssr)

This signal indicates which SSRs (System or User SSRs) should be used when
restoring the SSR to the hot flags and SCR0. It is asserted when restoring flags from the
System SSRs.

Extend pipe when returning from Interrupt Service Routine
(SP_VCU_reti_extend_pipe)

When asserted, this bit indicates that a return from interrupt will need to extend the
pipe.

(2) The address of the instruction in FETCH phase (current PC) is saved to the 
appropriate link register. 

(3) The interrupt handler is invoked through the normal means such as a vector 
table lookup and branch to target address. 

(4) When the RETI instruction is executed, it causes the restoration of the saved
SCR0 and link address from the appropriate link and saved-status registers.

(5) When the instruction at the link address reaches the EXECUTE phase, the data
in the interrupt forwarding registers, for those units whose last instruction prior to interrupt
handling was a two-cycle instruction, is made available to the register file 1064 and the CGU
1062 instead of the data coming from the corresponding unit. From the CGU and register file
point of view, this operation has the same behavior that would have occurred if the interrupt
had never happened.

Figs. 10C and 10D illustrate interrupt pipeline diagrams 1080 and 1090 for an
example of interrupt processing as currently implemented. The columns SSR Save 1084,
SSR-XFER 1086, OP in Fetch 1088, System Mode 1090 and User Mode 1092 in Fig. 10C
show the state of the interrupt state machine for each cycle indicated in the cycle column
1082. Further, Fig. 10D shows the pipeline state of a bit within the interrupt request register
(IRR) 1095, the instruction whose address is contained in the interrupt link register (ILR)
1096, the state of the interrupt status register (ISR) 1097, the state of the GPIE interrupt
enable bit found in SCR0 1098, the interrupt level (ILVL) 1099, and the instruction being
processed in the set of pipeline stages (fetch (F) 1021, predecode (PD) 1023, decode (D)
1025, execute 1 (EX1) 1027, and condition return (CR) 1029). It is assumed that the
individually selectable general purpose interrupts are enabled and the interrupt vector number
that is stored in SCR1 gets updated at the same time that IMOD is updated in SCR0.

In the present exemplary processes, any time an interrupt is taken, there will be 3
cycles during which information needed to restore the pipeline is saved away in the saved
status registers (SSR0, SSR1, and SSR2). The information is saved when the SSR-SAVE
column 1084 in table 1080 has a "1" in it. The easiest way to understand how the three 32-bit
SSR registers are loaded is by breaking them down into six 16-bit fields. SSR0 is made up of
the user mode decode phase (UMDP) and user mode execute phase (UMEP) components.
SSR1 is made up of the user mode condition return phase (UMCP) and system mode
condition return phase (SMCP) components. SSR2 is made up of the system mode decode
phase (SMDP) and system mode execute phase (SMEP) components.

SMCP - System Mode Condition Return Phase (Upper Half of SSR1)
SMEP - System Mode Execute Phase (Upper Half of SSR2)
SMDP - System Mode Decode Phase (Lower Half of SSR2)
UMCP - User Mode Condition Return Phase (Lower Half of SSR1)
UMEP - User Mode Execute Phase (Upper Half of SSR0)
UMDP - User Mode Decode Phase (Lower Half of SSR0)

When interrupt processing begins, the data is first stored to the system mode registers. Then,
depending on the mode of operation before and after the interrupt, the system mode registers
may be transferred to the user mode registers. For example, if the mode of operation before
the interrupt is taken is a USER mode, the SSR-XFER will be asserted. If the SSR-XFER bit
in column 1086 is asserted, the contents of the system mode registers are transferred to the
user mode registers.
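
The division of the three 32-bit SSRs into six 16-bit fields can be illustrated with a short packing sketch. The field placement follows the list above; taking "upper half" to mean bits 31-16 and "lower half" to mean bits 15-0 is an assumption, and the function names are hypothetical.

```python
from typing import Dict, Tuple


def pack_halves(upper: int, lower: int) -> int:
    """Combine two 16-bit fields into one 32-bit register value."""
    # Assumption: the upper half occupies bits 31-16, the lower half bits 15-0.
    return ((upper & 0xFFFF) << 16) | (lower & 0xFFFF)


def pack_ssrs(fields: Dict[str, int]) -> Tuple[int, int, int]:
    """Build (SSR0, SSR1, SSR2) from the six named 16-bit fields."""
    ssr0 = pack_halves(fields["UMEP"], fields["UMDP"])  # user execute / user decode
    ssr1 = pack_halves(fields["SMCP"], fields["UMCP"])  # system CR / user CR
    ssr2 = pack_halves(fields["SMEP"], fields["SMDP"])  # system execute / system decode
    return ssr0, ssr1, ssr2
```

Note that SSR1 is the only register that mixes a system mode field with a user mode field; SSR0 is entirely user mode and SSR2 entirely system mode.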
In the example shown in Fig. 10C, the floating point subtract (FSub), a 2-cycle
instruction, is preempted by an interrupt. The Hot State Flags (HOTSFs) are control bits
indicating local machine state in the exemplary implementation and these are as follows:

HOTSFs = {HOTSF3, HOTSF2, HOTSF1, HOTSF0};

HOTSF3 = bit indicating that a 2-cycle operation is in execute and it could have
control of the flag update.

HOTSF2 = bit indicating that a 2-cycle ALU instruction is in the execute (EX1)
pipeline stage.

HOTSF1 = bit indicating that a 2-cycle MAU instruction is in the execute (EX1)
pipeline stage.

HOTSF0 = bit indicating that a LU or DSU instruction that is targeted at SCR0 is in
the execute (EX1) pipeline stage.

In cycle 4 1081, since the SSR-SAVE signal was asserted, the FSub hotflags and hot
state flags will be saved into SMCP. The SMCP is loaded with the Hotflags, arithmetic
scalar flags (CNVZ), arithmetic condition flags (F0-F7), and the HOTSFs signals for the
instruction that would be in Execute if the interrupt had not occurred, in this example, the
FSub. In cycle 5 1083, SMEP is loaded with the contents of SMCP, and SMCP is loaded
with the current hotflags and the hot state flags from cycle 4. The SMCP is loaded with the
Hotflags (CNVZ & F0-F7) and the HOTSFs from the previous cycle. In cycle 6 1085,
SMDP gets the contents of SMEP, SMEP gets the contents of SMCP, and SMCP gets loaded
with the current hotflags, and the hot state flags for cycle 4. The SMCP is loaded with the
Hotflags (CNVZ & F0-F7) and the HOTSFs from two cycles before.

In cycle 7 1087, since the SSR-XFER signal was asserted in the previous cycle, the 
user mode phase components are loaded with copies of the system mode phase components. 
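
The three save cycles and the subsequent transfer behave like a three-stage shift chain, which the following sketch models. It is a simplification under stated assumptions: each captured value is treated as an opaque 16-bit word, the same shift is applied on every save cycle (the text describes only the SMCP load in the first cycle), and the class and method names are hypothetical.

```python
class SavedStatusModel:
    """Shift-chain sketch of the six 16-bit saved status fields."""

    def __init__(self) -> None:
        self.smcp = self.smep = self.smdp = 0  # system mode fields
        self.umcp = self.umep = self.umdp = 0  # user mode fields

    def ssr_save(self, captured: int) -> None:
        """One cycle with SSR-SAVE asserted: shift, then capture into SMCP."""
        self.smdp = self.smep             # SMDP gets the contents of SMEP
        self.smep = self.smcp             # SMEP gets the contents of SMCP
        self.smcp = captured & 0xFFFF     # SMCP gets the newly captured flags

    def ssr_xfer(self) -> None:
        """Cycle after SSR-XFER: copy system mode fields to user mode fields."""
        self.umcp, self.umep, self.umdp = self.smcp, self.smep, self.smdp
```

After three consecutive saves, the value captured first rests in SMDP, the second in SMEP and the third in SMCP, matching the cycle 4 to cycle 6 sequence described above.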

Whenever the SSR-save bit is asserted and a 2-cycle operation (ALU or MAU) is in
the EX2 pipeline stage, the target compute register of the 2-cycle operation is not updated.
Rather, the data, address, and write enables (i.e., bits indicating data type) are stored in the
corresponding execution unit forwarding registers.

In more detail, the pipeline diagram of Fig. 10D depicts the events that occur when a
GPI preempts a user mode program after the fetch of a single cycle subtract (Sub) short
instruction word with a nonexpanded normal pipe. Note that the SSR-XFER bit 1094 is
asserted in this case since it is a GPI taking the mode of operation from a user mode
(ILVL=USR) to a system mode (ILVL=GPI). It would also be asserted when taking an
interrupt that leaves the mode of operation in the same mode as it was before the interrupt
came along (i.e., nesting general purpose interrupts). For the interrupt request register (IRR)
1095, the bit corresponding to the interrupt taken is cleared in the IRR. The general purpose




or debug interrupt link register (ILR) 1096 holds the address of the instruction that will be
executed following the interrupt. In Fig. 10D, only one of these registers (GPILR) is shown
in column 1096. The general purpose or debug interrupt status register (GPISR or DBISR)
1097 contains a copy of SCR0, so that flag state may be restored following the interrupt.
Here, only one of these registers (GPISR) is shown in column 1097. Interrupt enable (IE)
bits 31-29 of SCR0 are GPI enable, NMI enable, and DBI enable; here only the applicable
enable bit (GPIE) 1098 is shown. Bits 28 and 27 of SCR0 contain the interrupt mode
(IMode) which encodes the four modes: user, GPI, NMI, or debug.
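
The SCR0 bit positions named above can be illustrated with a small field extractor. This is a sketch only: assigning GPI enable to bit 31, NMI enable to bit 30 and DBI enable to bit 29 assumes the three enables are listed in descending bit order, the names NMIE and DBIE are hypothetical labels, and the numeric encoding of the four interrupt modes is not given in the text, so the mode is returned as a raw 2-bit value.

```python
from typing import Dict


def decode_scr0_interrupt_fields(scr0: int) -> Dict[str, int]:
    """Extract the interrupt enable bits and interrupt mode from SCR0."""
    return {
        # Assumption: the enables appear in the order listed, from bit 31 down.
        "GPIE": (scr0 >> 31) & 1,   # general purpose interrupt enable
        "NMIE": (scr0 >> 30) & 1,   # NMI enable (hypothetical label)
        "DBIE": (scr0 >> 29) & 1,   # debug interrupt enable (hypothetical label)
        # Bits 28-27 hold the interrupt mode; the value-to-mode mapping
        # (user, GPI, NMI, debug) is not specified, so it stays numeric.
        "IMODE": (scr0 >> 27) & 0b11,
    }
```

For example, a value with only bit 31 set decodes to GPIE = 1 with every other field zero.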
CE3c Extension 

In the exemplary ManArray processor, a hierarchical conditional execution
architecture is defined comprising 1-bit, 2-bit, and 3-bit forms. The 1-bit form is a subset of
the 2-bit and 3-bit forms and the 2-bit form is a subset of the 3-bit form. In the exemplary
ManArray processor, the load and store units use a CE1 1-bit form, the MAU, ALU, and
DSU use the 3-bit CE3 form, though different implementations may use subsets of the 3-bit
form depending upon algorithmic needs. The hierarchical conditional execution architecture
is further explained in U.S. Patent Application Serial No. 09/238,446 entitled "Methods and
Apparatus to Support Conditional Execution in a VLIW-Based Array Processor With
Subword Execution" filed on January 28, 1999 and incorporated herein in its entirety.

Two 3-bit forms of conditional execution, CE3a and CE3b, specify how to set the
ACFs using C, N, V, or Z flags. These forms are described in greater detail in the above
mentioned application. A new 3-bit form is specified in the present invention and labeled
CE3c. The N and Z options available in the 3-bit CE3a definition are incorporated in the new
CE3c encoding format 1100 encodings 1105 and 1106 respectively, illustrated in Fig. 11.
The present invention addresses the adaptation of CE2 to use its presently reserved encoding
for a registered SetCC form of conditional execution. The new form of CE2, which is a
superset of the previous CE2, is referred to as CE2b whose encoding format is shown in table
1200 of Fig. 12. A new programmable register is used in conjunction with the CE2b and
CE3c encodings and is named the SetCC field of SCR0 as addressed further below. These
bits are used to specify many new combinations of the arithmetic side effect (C, N, V, and Z)
flags to cover exceptions detected in the execution units and to provide enhanced flexibility
in each of the instructions for algorithmic use. Due to the improved flexibility, it may be
possible to replace the original 3-bit CE3a or CE3b with CE3c in future architectures.
Alternatively, a mode bit or bits of control could be provided and the hardware could then
support the multiple forms of CE3. These CE3 encodings specify whether an instruction is to
unconditionally execute and not affect the ACFs, conditionally execute on true or false and
not affect the ACFs, or provide a register specified conditional execution function. The ASFs
are set as defined by the instruction. In an exemplary implementation for a ManArray
processor, the SetCC field of 5 bits 1310 will preferably be located in an SCR0
register 1300 as shown in Fig. 13. The new format of SCR0 includes the addition of the
SetCC bits 12-8 1310, an exception mask bit 13 1315, and the maskable PE exception
interrupt signal bit 20 1320. C, N, V, Z, cc, SetCC, ccmask, and F7-F0 bits are always set to
0 by reset. The proposed SetCC definition shown in encoding table 1400 of Figs. 14A and
14B specifies some logical combination of flags such as packed data ORing of flags. The
encoding also reserves room for floating point exception flags, or the like, for future
architectures.

A proposed syntax defining the SetCC operations is "OptypeCC" where the CC
represents the options given in Figs. 14A and 14B for a number of logical combinations of
the ASFs. The number of ACFs affected is determined by the packed data element count in
the current instruction and shown in Figs. 14A and 14B. Figs. 14A and 14B specify the use
of packed data side effect signals C, N, V, and Z for each elemental operation of a multiple
element packed data operation. These packed data side-effect signals are not programmer
visible in the exemplary ManArray system. Specifically, the C7-C0, N7-N0, V7-V0, and Z7-Z0
terms represent internal flag signals pertinent for each data element operation in a packed
data operation. "Size" is a packed data function that selects the appropriate affected C7-C0,
N7-N0, V7-V0, and Z7-Z0 terms to be ORed based on the number of data elements involved
in the packed data operation. For example, in a quad operation, the internal signals C3-C0,
N3-N0, V3-V0, and Z3-Z0 may be affected by the operation and would be involved in the
ORing while C7-C4, N7-N4, V7-V4, and Z7-Z4 are not affected and would not be involved
in the specified operation.
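
The effect of the "Size" selection on flag ORing can be sketched for one flag type. The sketch assumes that element 0 maps to list index 0 and that the low-numbered signals are the ones used by operations with fewer elements, as in the quad example above; the function name is hypothetical, and the full SetCC encodings of Figs. 14A and 14B are not modeled.

```python
from typing import Sequence


def size_or(element_flags: Sequence[bool], element_count: int) -> bool:
    """OR together the per-element flags selected by the element count.

    element_flags holds eight internal signals, e.g. C0..C7 at
    indices 0..7. Only the first element_count signals take part.
    """
    if len(element_flags) != 8 or not 1 <= element_count <= 8:
        raise ValueError("expected 8 flags and an element count in 1..8")
    return any(element_flags[:element_count])
```

For a quad operation, a carry raised only on element 5 therefore does not contribute, while a carry on element 2 does.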

A new form of CE3 conditional execution architecture is next addressed with
reference to Fig. 11. Two of the CE3c encodings 1103 and 1104 specify the partial execution
of packed data operations based upon the ACFs. CE3c also includes the CE2b general
extension that controls the setting of the ACFs based upon the registered SetCC parameter
1102. The proposed CE3c 3-bit conditional execution architecture in ManArray provides the
programmer with five different levels of functionality:

1. unconditional execution of the operation, does not affect the ACFs,

2. conditional execution of the operation on all packed data elements, does not
affect the ACFs,

3. unconditional execution of the operation, ACFs set as specified by the SetCC
register,

4. conditional selection of data elements for execution, does not affect the ACFs,
and

5. unconditional execution of the operation with control over ACF setting.

In each case, data elements will be affected by the operation in different ways:

1. In the first case, the operation always occurs on all data elements. 

2. In the second case, the operation either occurs on all data elements or the
operation does not occur at all.

3. In the third case, the operation always occurs on all data elements and the
ACFs are set in the CR phase of this operation. The 011 CE3c encoding 1102 shown in Fig.
11 would allow the ACFs F7-F0 to be set as specified by a SetCC register as seen in Figs.
14A and 14B.

4. In the fourth case, the operation always occurs but only acts on those data
elements that have a corresponding ACF of the appropriate value for the specified true or
false coding. In this fourth case, the packed data instruction is considered to partially execute
in that the update of the destination register in the SP or in parallel in the PEs only occurs
where the corresponding ACF is of the designated condition.

5. In the fifth case, the N and Z flags represent two side effects from the
instruction that is executing. An instruction may be unconditionally executed and affect the
flags based on one of the conditions, N or Z.

The syntax defining the fourth case operations is "Tm" and "Fm," for "true multiple"
and "false multiple." The "multiple" case uses the packed data element count in the current
instruction to determine the number of flags to be considered in the operation. For example,
an instruction Tm.add.sa.4h would execute the add instruction on each of the 4 halfwords
based on the current settings of F0, F1, F2, and F3. This execution occurs regardless of how
these four flags were set. This approach enables the testing of one data type with the
operation on a second data type. For example, one could operate on quad bytes setting flags
F3-F0, then a conditional quad half-word operation can be specified based on F3-F0
providing partial execution of the packed data type based on the states of F3-F0. Certain
instructions, primarily those in the MAU and ALU, allow a conditional execution CE3c 3-bit
extension field to be specified.
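
The partial execution of the "Tm" and "Fm" forms can be sketched as follows. This is a
minimal illustration, not the hardware definition: the function name
conditional_packed_add and the list representation of packed elements and ACFs are
hypothetical, and element width, wraparound, and saturation are deliberately not modeled.

```python
def conditional_packed_add(dest, a, b, acf, want_true=True):
    """Model of a 'Tm' (want_true=True) or 'Fm' (want_true=False) packed add.

    dest, a, b -- lists of equal length, one entry per packed data element
    acf        -- ACF values F0, F1, ...; only the first len(dest) are considered
    Returns the new destination contents. Elements whose flag does not match the
    designated condition keep their previous value (partial execution).
    """
    selected = 1 if want_true else 0
    # zip stops at the shortest list, so the element count of the operands
    # decides how many flags are considered.
    return [
        (x + y) if flag == selected else old
        for old, x, y, flag in zip(dest, a, b, acf)
    ]


# Four halfword elements controlled by F0-F3; F4-F7 are present but not considered.
flags = [1, 0, 1, 0, 0, 0, 0, 0]
tm_result = conditional_packed_add([0, 0, 0, 0], [1, 2, 3, 4], [10, 20, 30, 40], flags)
fm_result = conditional_packed_add([0, 0, 0, 0], [1, 2, 3, 4], [10, 20, 30, 40], flags,
                                   want_true=False)
```

With these inputs the "true multiple" form updates elements 0 and 2 only, and the "false
multiple" form updates elements 1 and 3 only.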
PE Exception Interrupts 

Since the interrupt logic is in an SP, such as the SP 101, a mechanism to detect
exceptions and forward the PE exception information to the SP is presented next. In
addition, a method of determining which instruction caused the exception interrupt, in which
PE, and in which sub data type operation is also discussed.

One of the first questions to consider is when an exception can be detected and how
this detection will be handled in the pipeline. The present invention operates utilizing a PE
exception which can cause an interrupt to the SP, and the PE exception is based upon
conditions latched at the end of the CR phase. A whole cycle is allowed to propagate any
exception signal from the PEs to the interrupt logic in the SP. Each PE is provided with an
individual wire for the exception signal to be sent back to the SP where it is stored in an MRF
register. These PE exception signals are also ORed together to generate a maskable PE
exception interrupt. The cc flag represents the maskable PE exception interrupt signal. By
reading the PE exception field in an MRF register, the SP can determine which PE or PEs
have exceptions. Additional details relating to the PE exception are obtained by having the
SP poll the PE causing an exception to find out the other information concerning the
exception such as which data element in a packed operation caused the problem. This
PE-local information is stored in a PE MRF register. One acceptable approach to resetting
stored exception information is to reset it automatically whenever the values are read.
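
The collection of per-PE exception signals in the SP can be sketched as below. This is an
illustrative model only; the class name PEExceptionField, its methods, and the
representation of the mask as a boolean argument are assumptions and do not describe
the actual MRF register layout.

```python
class PEExceptionField:
    """Hypothetical model of the PE exception field held in an SP MRF register."""

    def __init__(self, num_pes):
        self.bits = [0] * num_pes

    def latch(self, pe_signals):
        # Each PE drives its own wire; a set bit stays stored until it is read.
        self.bits = [old | new for old, new in zip(self.bits, pe_signals)]

    def interrupt(self, enabled=True):
        # OR of all stored PE exception signals, gated by the interrupt mask.
        return int(enabled and any(self.bits))

    def read(self):
        # Report which PEs have exceptions, then reset the stored information
        # (the "reset automatically whenever the values are read" approach).
        pending = [pe for pe, bit in enumerate(self.bits) if bit]
        self.bits = [0] * len(self.bits)
        return pending


field = PEExceptionField(4)
field.latch([0, 1, 0, 1])        # PE1 and PE3 signal an exception
raised = field.interrupt()       # maskable PE exception interrupt is asserted
pes_with_exceptions = field.read()
```

After the read, the stored bits are clear, so a second call to interrupt() returns 0 until a
PE signals a new exception.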

In certain implementations, it is possible to make selectable the use of the SetCC
register to either set the ACFs, cause an exception interrupt, or both for the programmed
SetCC register specified condition. If the SetCC is enabled for exception interrupts and if the
specified condition is detected, then an exception interrupt would be generated from the PE
or PEs detecting the condition. This exception interrupt signal is maskable. If SetCC is to be
used for setting ACFs and generating exception interrupts, then, depending upon system
requirements, two separate SetCC type registers can be defined in a more optimum manner
for each intended use. When a single SetCC register is used for both ACF and exception
interrupt, note that the exception cc is tested for every cycle while the F0 flag can only be set
when an instruction is issued using the 011 CE3c encoding 1102 as shown in Fig. 11.

For determining which instruction caused an exception interrupt, a history buffer in
the SP is used containing a set number of instructions in the pipeline history so that the
instruction that indirectly caused the PE exception can be determined. The number of history
registers used depends upon the length of the instruction pipeline. A method of tagging the
instructions in the history buffer to identify which one caused the exception interrupt is used.
Even in SMIMD operation, this approach is sufficient since the contents of the VIM can be
accessed if necessary. An ACF history buffer in each PE and the SP can also be used to
determine which packed data element caused the exception.
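
One way to picture the history buffer is as a fixed-depth queue of tagged instructions.
The sketch below is an assumption-laden illustration: the class name HistoryBuffer, the
use of an integer tag per issued instruction, and the lookup-by-tag interface are all
hypothetical, since the text does not specify the tagging method.

```python
from collections import deque


class HistoryBuffer:
    """Hypothetical fixed-depth record of recently issued instructions."""

    def __init__(self, depth):
        # depth would be chosen from the length of the instruction pipeline;
        # the oldest entry is discarded automatically when the buffer is full.
        self.entries = deque(maxlen=depth)

    def issue(self, tag, instruction):
        self.entries.append((tag, instruction))

    def lookup(self, tag):
        # Return the instruction recorded under this tag, or None if it has
        # already left the history window.
        for recorded_tag, instruction in self.entries:
            if recorded_tag == tag:
                return instruction
        return None


history = HistoryBuffer(depth=3)
for tag, text in enumerate(["add.4h", "sub.4h", "mpy.2w", "add.8b"]):
    history.issue(tag, text)
culprit = history.lookup(2)   # still within the three-entry window
expired = history.lookup(0)   # pushed out by the fourth instruction
```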

Alternatives for the Arithmetic Scalar Flag (ASF) Definition

The definition of the C, N, V, Z flags, known collectively as the ASFs, to be used in
an exemplary system specifies the ASFs to be based on the least significant operation of a
packed data operation. For single or one word (1W) operations, the least significant
operation is the same as the single word operation. Consequently, the JMPcc instruction
based on C, N, V, Z flags set by the 1W operation is used regularly. Setting of the C, N, V, Z
flags by any other type of packed data operation in preparation for a JMPcc conditional
branch is not always very useful so improving the definition of the ASFs would be beneficial.

Improvements to the ASF definition addressed by the present invention are described
below. The present C flag is replaced with a new version C' that is an OR of the packed data
C flags. Likewise the N flag is replaced with a new version N' that is an OR of the packed
data N flags, the V flag with a V' that is an OR of the packed data V flags, and the Z flag with
a Z' that is an OR of the packed data Z flags. The OR function is based upon the packed data
size, i.e., a 4H operation ORs four flags and an 8B operation ORs eight. In the 1W case, any
existing code for an existing system which uses the JMPcc based upon 1W operations would
also work in the new system and no change to the existing code would be needed. With the
OR of the separate flags across the data types, some unique capabilities are obtained. For
example, if any packed data result produced an overflow, a conditional JMP test could be
easily done to branch to an error handling routine.

In a first option, for JMPcc conditions based upon logical combinations of C', N', V',
and Z', the preceding operation would need to be of the 1W single word type, otherwise the
tested condition may not be very meaningful. To make JMPcc type operations based upon
logical combinations of the ASF' flags more useful, a further change is required. The
execution units which produce C, N, V, and Z flags must latch the individual packed data C,
N, V, and Z flag information at the end of an instruction's execution cycle. In the condition
return phase, these individually latched packed data C, N, V, and Z information flags are
logically combined to generate individual packed data GT, LE, and the like signals. These
individual packed data GT, LE, and the like, signals can then be ORed into hot flag signals
for use by the JMPcc type instructions. These OR conditions are shown in Figs. 14A and
14B and are the same logical combinations used in the SetCC register specified conditions.
Then, a JMPGT would branch, if "any" of the packed data operations resulted in a GT
comparison. For example, following a packed data SUB instruction with a JMPGT becomes
feasible. Rather than saving all packed data flags in a miscellaneous register file (MRF)
register only the single hot flag state "cc" being tested for is saved in SCR0. Once the "cc"
state has been latched in SCR0 it can be used to cause an exception interrupt as defined
further in the PE exception interrupt section below, if this interrupt is not masked.
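
The difference between combining flags per element and combining already-ORed flags
can be seen in a short sketch. The GT condition used here (Z = 0 and N = V) is the
conventional signed greater-than test and is an assumption, since the combinations of
Figs. 14A and 14B are not reproduced in this text; the function names are hypothetical.

```python
def element_gt(n, v, z):
    # Conventional signed greater-than after a subtract or compare:
    # result is non-zero and the sign flag equals the overflow flag.
    return int(z == 0 and n == v)


def hot_flag_gt(n_flags, v_flags, z_flags, element_count):
    """Combine the latched flags per element first, then OR the per-element
    GT signals into a single hot flag."""
    per_element = [
        element_gt(n, v, z)
        for n, v, z in zip(n_flags[:element_count],
                           v_flags[:element_count],
                           z_flags[:element_count])
    ]
    return int(any(per_element))


# Quad operation: only element 2 compares greater-than.
n = [1, 1, 0, 1, 0, 0, 0, 0]
v = [0, 0, 0, 0, 0, 0, 0, 0]
z = [0, 0, 0, 0, 0, 0, 0, 0]
hot = hot_flag_gt(n, v, z, 4)

# Applying GT to the ORed flags N', V', Z' instead loses that information.
n_or, v_or, z_or = int(any(n[:4])), int(any(v[:4])), int(any(z[:4]))
gt_from_ored_flags = element_gt(n_or, v_or, z_or)
```

Here the hot flag is 1 because one element satisfies GT, while the same test applied to
the ORed flags yields 0, which is why the per-element flags are latched and combined
before the OR.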

As an alternate second option, it is possible to define, for both Manta and ManArray
approaches, that only the 1W case is meaningful for use with the JMPcc, CALLcc, and other
conditional branch type instructions. By using the SetCC register and conditional execution
with CE3b and CE3c, it will be possible to set the ACFs based upon a logical combination of
the packed data ASFs and then use true (T.) or false (F.) forms of the JMP, CALL, and other
conditional instructions to accomplish the same task.

For ManArray, the generic ASF is as follows:
Arithmetic Scalar Flags Affected

C = 1 if a carry occurs on any packed data operation, 0 otherwise,

N = MSB of result of any packed data operation,

V = 1 if an overflow occurs on any packed data operation, 0 otherwise, and

Z = 1 if result is zero on any packed data operation, 0 otherwise.
PE Exception Interrupts Alternative 

Rather than have each PE supply a separate exception wire, an alternative approach is
to use a single wire that is daisy-chain ORed as the signal propagates from PE to PE, as
shown for PE0-PEn for system 1560 of Fig. 15. In Fig. 15, a single line ORed exception
signal and an exemplary signal flow are illustrated where the exception cc is generated in
each PE assuming that cc=0 for no exception and cc=1 for an exception. The exception cc is
generated every instruction execution cycle as specified by the SetCC register. If multiple
PEs cause exceptions at the same time, each exception is handled sequentially until all are
handled.

The PE addresses are handled in a similar manner as the single exception signal. An
additional set of "n" wires for a 2^n array supplies the PE address. For example, a 4x4 array
would require only five signal lines, four for the address and one for the exception signal. An
exemplary functional view of suitable address logic 1600 for each PE in a 2x2 array is shown
in Fig. 16. The logic 1600 is implemented using a 2x2 AND-OR, such as AND-ORs 1602
and 1604 per PE address bit.

With this approach, the PE closest to the SP on the chain will block PE exception
addresses behind it until the local PE's exception is cleared. It is noted that if each PE can
generate multiple exception types and there becomes associated with each type a priority or
level of importance, then additional interface signals can be provided between PEs to notify
the adjacent PEs that a higher priority exception situation is coming from a PE higher up in
the chain. This notification can cause a PE to pass the higher priority signals. In a similar
manner, an exception interface can be provided that gives the exception type information
along with the PE address and single exception signal. The exception types can be monitored
to determine priority levels and whether a PE is to pass a signal to the next PE or not.
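
The daisy-chained exception signal and address logic can be sketched as follows. This is a
functional illustration under assumptions: the function name propagate_chain is
hypothetical, the PEs are listed from the far end of the chain toward the SP, and each
address bit is formed by a two-term AND-OR that selects the local address bit when the
local cc is 1 and otherwise passes the incoming bit. Priority signaling between PEs is not
modeled.

```python
def propagate_chain(pes, address_bits):
    """pes: list of (pe_address, cc) tuples ordered from the PE farthest from
    the SP to the PE closest to the SP. Returns (cc, address) as seen by the SP."""
    cc_in, addr_in = 0, 0
    for pe_address, cc_local in pes:
        addr_out = 0
        for bit in range(address_bits):
            local_bit = (pe_address >> bit) & 1
            in_bit = (addr_in >> bit) & 1
            # AND-OR per address bit: a local exception drives the local
            # address and blocks the address arriving from behind this PE.
            out_bit = (local_bit & cc_local) | (in_bit & (1 - cc_local))
            addr_out |= out_bit << bit
        cc_in |= cc_local          # single-wire daisy-chain OR
        addr_in = addr_out
    return cc_in, addr_in


# 2x2 array (two address bits): PE3 and PE1 both raise exceptions.
both = propagate_chain([(3, 1), (2, 0), (1, 1), (0, 0)], address_bits=2)
# Once PE1's exception is cleared, PE3's address reaches the SP.
after_clear = propagate_chain([(3, 1), (2, 0), (1, 0), (0, 0)], address_bits=2)
```

In the first call the SP sees the address of PE1, the exception nearest to it on the chain;
PE3 is reported only after PE1 has been cleared, which matches the sequential handling
described above.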
Debug Interrupt Processing 

There is a region of DSP instruction memory called an "interrupt vector table" (IVT)
1701 and shown in Fig. 17 which contains a sequence of instruction addresses. For the
exemplary system this table resides at instruction memory address 0x0000 through 0x007F,
where each entry is itself the 32-bit (4 byte) address of the first instruction to be fetched after
the interrupt control unit accepts an interrupt signal corresponding to the entry. The first
entry at instruction memory address 0x0000 (1740) contains the address of the first
instruction to fetch after RESET is removed. The third entry at instruction memory address
0x0008 (1722) contains the address of the first instruction to be fetched when a debug
interrupt occurs. Debug interrupts have the highest interrupt priority and are accepted at
almost any time and cannot be masked. There are a few times at which a debug interrupt is
not immediately acknowledged, such as when a load-VLIW (LV) instruction sequence is in
progress, but there are few of these cases. There is a special table entry at instruction memory
address 0x0004 (1720) in the exemplary system.
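
Because each IVT entry is a 4 byte address, the location of an entry follows directly from
its position in the table. The short sketch below only restates that arithmetic; the
function and constant names are hypothetical, and a table base of 0x0000 is taken from
the exemplary system.

```python
ENTRY_SIZE_BYTES = 4   # each entry is a 32-bit instruction address
IVT_BASE = 0x0000      # table base in the exemplary system


def ivt_entry_address(index):
    """Instruction memory address of the IVT entry at the given zero-based index."""
    return IVT_BASE + index * ENTRY_SIZE_BYTES


RESET_VECTOR = ivt_entry_address(0)    # first entry
SPECIAL_ENTRY = ivt_entry_address(1)   # special entry at 0x0004
DEBUG_VECTOR = ivt_entry_address(2)    # third entry
```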

This entry has a "shadow" register 1800 associated with it called the Debug
Instruction Register (DBIR) shown in Fig. 18. In addition, there is a set of control bits that
are used to determine its behavior. Normally, in responding to an interrupt, a value is fetched
from the IVT and placed into the program counter (PC) 1760, and it determines where the
next instruction will be fetched. If a program branch targets an address in the IVT memory
range, then the value fetched would be assumed to be an instruction and placed into the
instruction decode register (IDR) 1750. Since the IVT contains addresses and not
instructions, this would normally fail. However, in the case of address 0x0004, an instruction
fetch targeting this address will cause the processor to attempt to fetch from its "shadow"
register, the DBIR (if it is enabled). If there is an instruction in the DBIR, then it is read and
placed into the IDR for subsequent decode. If there is not an instruction in the DBIR, the
processor stalls immediately, does not advance the instructions in the pipeline, and waits for
an instruction to be written to the DBIR. There are three control bits which relate to the
DBIR. The debug instruction register enable (DBIREN) bit 1920 of the DSP control register
(DSPCTL) 1900 shown in Fig. 19 when set to 1 enables the DBIR "shadow" register. If this
bit is 0, then a fetch from 0x0004 will return the data from that instruction memory location
with no special side-effects. Two other bits residing in the Debug Status Register
(DBSTAT) 2000 of Fig. 20 are the "debug instruction present" (DBIP) bit 2030, and the
"debug stall" (DBSTALL) bit 2020. The DBIP bit is set whenever a value is written to the
DBIR either from the MCB or from the SPR bus. This bit is cleared whenever an instruction
fetch from 0x0004 occurs (not an interrupt vector fetch). When the DBIP bit is cleared and an
instruction fetch is attempted from 0x0004 then the DBSTALL bit of the DBSTAT register is
set and the processor stalls as described above. When the DBIP bit is set and an instruction
fetch is attempted, the contents of the DBIR are sent to the IDR for decoding and subsequent
execution.
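
The interaction of the DBIREN, DBIP, and DBSTALL bits on an instruction fetch from
0x0004 can be summarized in a small behavioral model. This is a sketch under
assumptions: the class and method names are hypothetical, instruction memory is a plain
dictionary, a stalled fetch is represented by returning None, and the sketch does not
model when DBSTALL is cleared because the text does not state it.

```python
DBIR_ADDRESS = 0x0004


class DebugFetchModel:
    """Hypothetical behavioral model of instruction fetch around the DBIR."""

    def __init__(self, instruction_memory):
        self.imem = instruction_memory   # dict: address -> memory word
        self.dbiren = 0                  # DSPCTL enable bit for the DBIR
        self.dbip = 0                    # DBSTAT "debug instruction present"
        self.dbstall = 0                 # DBSTAT "debug stall"
        self.dbir = None

    def write_dbir(self, instruction):
        # A write from the MCB or the SPR bus sets DBIP.
        self.dbir = instruction
        self.dbip = 1

    def fetch(self, address):
        """Instruction fetch (not an interrupt vector fetch)."""
        if address != DBIR_ADDRESS or not self.dbiren:
            # DBIR disabled or ordinary address: plain memory contents.
            return self.imem.get(address)
        if self.dbip:
            self.dbip = 0                # cleared by the fetch from 0x0004
            return self.dbir             # forwarded to the IDR
        self.dbstall = 1                 # nothing present: processor stalls
        return None


model = DebugFetchModel({0x0004: 0x1234})
plain = model.fetch(0x0004)              # DBIREN = 0: memory word is returned
model.dbiren = 1
stalled = model.fetch(0x0004)            # DBIP = 0: stall, DBSTALL set
model.write_dbir("JMPD monitor")
injected = model.fetch(0x0004)           # DBIP = 1: DBIR contents returned
```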

When the debug interrupt vector at instruction memory address 0x0008 is loaded
with a value of 0x0004, and the DBIREN bit of the DSPCTL register is set to 1 (enabling the
DBIR), then when a debug interrupt occurs, 0x0004 is first loaded into the PC (vector load)
and the next instruction fetch is attempted at address 0x0004. When this occurs, the
processor either stalls (if DBIP = 0) or fetches the instruction in the DBIR and executes it.
Using this mechanism it is possible to stop the processor pipeline (having saved vital
hardware state when the interrupt is accepted) and have an external agent, a test module (or
debugger function), take over control of the processor.

As an additional note, on returning from any interrupt, at least one instruction is
executed before the next interrupt vector is fetched, even if an interrupt is pending when the
return-from-interrupt instruction (RETI) is executed. In the case where a debug interrupt is
pending when the RETI instruction is executed, exactly one instruction is executed before
fetching from the first address of the debug service routine (or from the DBIR if the vector is
programmed to 0x0004). This behavior allows the program to be single-stepped by setting
the debug interrupt request bit in the interrupt request register (IRR) while still in the debug
interrupt handler. Then when the RETI is executed, a single instruction is executed before
reentering the debug interrupt mode.

Two additional registers along with two control bits are used during debug processing
to allow a debug host or test module to communicate with debug code running in the target
processor. The debug-data-out (DBDOUT) register 2100 of Fig. 21 and the debug-data-in
(DBDIN) register 2200 of Fig. 22 are used for sending data out from the processor and
reading data into the processor respectively. A write to the DBDOUT register causes a status
bit, debug data output buffer full bit (DBDOBF) 2040 of the DBSTAT register to be set.
This bit also controls a signal which may be routed to an interrupt on an external device (e.g.
the test module or debug host). The complement of this signal is routed also to an interrupt
on the target processor so that it may use interrupt notification when data has been read from
the DBDOUT register. The DBDOUT register is visible to MCB bus masters and, when read,
causes the DBDOBF bit to be cleared. An alternate read address is provided which allows the
DBDOUT data to be read without clearing the DBDOBF bit. When an external debug host
or test module writes to the DBDIN register, the debug data input-buffer-full bit (DBDIBF)
2050 of the DBSTAT register is set. This bit also controls a signal which is routed to an
interrupt on the processor target. The complement of this signal is available to be routed back
to the debug host or test module as an optional interrupt source. When the target processor
reads the DBDIN register, the DBDIBF bit is cleared.
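
The handshake on the DBDOUT and DBDIN registers amounts to a pair of one-word
mailboxes with "buffer full" flags. The sketch below models only the flag behavior stated
above; the class name DebugMailbox, the method names, and the boolean argument that
stands in for the alternate read address are assumptions, and the interrupt routing is not
modeled.

```python
class DebugMailbox:
    """Hypothetical model of DBDOUT/DBDIN and their DBSTAT status bits."""

    def __init__(self):
        self.dbdout = 0
        self.dbdin = 0
        self.dbdobf = 0   # debug data output buffer full
        self.dbdibf = 0   # debug data input buffer full

    # Target processor side
    def target_write_dbdout(self, value):
        self.dbdout = value
        self.dbdobf = 1                  # set by a write to DBDOUT

    def target_read_dbdin(self):
        self.dbdibf = 0                  # cleared when the target reads DBDIN
        return self.dbdin

    # Debug host or test module side (MCB bus master)
    def host_read_dbdout(self, alternate=False):
        if not alternate:
            self.dbdobf = 0              # normal read clears DBDOBF
        return self.dbdout               # alternate read leaves DBDOBF set

    def host_write_dbdin(self, value):
        self.dbdin = value
        self.dbdibf = 1                  # set by a write to DBDIN


mailbox = DebugMailbox()
mailbox.target_write_dbdout(0xCAFE)
peeked = mailbox.host_read_dbdout(alternate=True)   # DBDOBF stays set
full_after_peek = mailbox.dbdobf
taken = mailbox.host_read_dbdout()                  # DBDOBF is cleared
mailbox.host_write_dbdin(0x55)
received = mailbox.target_read_dbdin()              # DBDIBF is cleared
```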

Given the preceding background, the following discussion describes a typical debug
sequence assuming that the debug interrupt vector in the IVT is programmed with a 0x0004
(that is, pointing to the DBIR register) and the DBIR is enabled (DBIREN = 1). Fig. 23
illustrates an exemplary DSP ManArray processor 2310 residing on an MCB 2030 and an
MDB 234. An external device which we will call the "test module" residing on the MCB,
initiates a debug interrupt on the target processor core. The test module is assumed to be an
MCB bus master supporting simple read and write accesses to slave devices on the bus. The
test module actually provides an interface between some standard debug hardware (such as a
JTAG port or serial port) and the MCB, and translates read/write requests into the MCB
protocol. A debug interrupt may be initiated by writing to a particular MCB address, or
configuring an instruction event point register described in further detail in U.S. Application
Serial No. 09/598,566 to cause a debug interrupt when a particular DSP condition occurs
such as fetching an instruction from a specified address, or fetching data from a particular
address with a particular value.

The processor hardware responds to the interrupt by saving critical processor state,
such as the program status and control register, SCR0, and several other internal bits of state.
The debug interrupt vector is fetched (having contents 0x0004) into the PC and then the
processor attempts to read an instruction from 0x0004 causing an access to the DBIR
register. If the DBIP bit of the DBSTAT register is 0, then the processor stalls waiting for an
action from the test module. When the processor stalls, the DBSTALL bit of the DBSTAT
register is set to 1. This bit is also connected to a signal which may be routed (as an interrupt
for example) to the test module. This is useful if an event point register is used to initiate the
debug interrupt. Rather than polling the DBSTAT register, the test module may be
configured to wait for the DBSTALL signal to be asserted. If the DBIP bit is set to 1, then
the processor fetches the value in the DBIR and attempts to execute it as an instruction.
Typically, the DBIR does not have an instruction present when the debug interrupt is
asserted, allowing the processor to be stopped.

The debugger then reads a segment of the DSP instruction memory via the test
module, and saves it in an external storage area. It replaces this segment of user program
with a debug monitor program.

The test module then writes a jump-direct (JMPD) instruction to the DBIR. When
this occurs the DBIP bit is set, and the processor fetches this instruction into the IDR for
decode, after which the DBIP bit is cleared again. The debugger design must make sure that
no programmer visible processor state is changed until it has been saved through the test
module. This JMPD instruction targets the debug monitor code.

The monitor code is executed in such a way as to retain the program state. The
DBDOUT register is used to write data values and processor state out to the test module.

To resume program execution, the test module writes state information back to the
processor using the DBDIN register. When all state has been reloaded, the debug monitor
code jumps to instruction address 0x0004 which results in a debug stall.

The test module lastly writes an RETI instruction to the DBIR which causes the
internal hardware state to be restored and execution resumed in the program where it was
interrupted.

It will be noted that the debug sequence mentioned above could take place in several
stages with successive reloads of instructions, using very little instruction memory.

It should also be noted that it is possible to execute the state save/restore sequence by
just feeding instructions through the DBIR. Doing this requires that the PC be "locked", that
is, prevented from updating by incrementing. This is done using a bit of the DSP control
register (DSPCTL) called the "lock PC" (LOCKPC) bit 1930. When this bit is 1, the PC
is not updated as a result of instruction fetch or execution. This means when the LOCKPC
bit is 1, branch instructions have no effect, other than updating the state of the user link
register (ULR) (for CALL-type instructions). Typically a small amount of instruction
memory is used to "inject" a debug monitor program since this allows execution of state
save/restore using loop instructions providing a significant performance gain.

If a debug monitor is designed to be always resident in processor memory, when the
debug interrupt occurs, it does not need to be directed to the DBIR, but rather to the entry
point of the debug monitor code.

Reset of the processor is carried out using the RESETDSP bit 1940 of the DSPCTL
register. Setting this bit to 1 puts the processor into a RESET state. Clearing this bit allows
the processor to fetch the RESET vector from the IVT into the PC, then fetch the first program
instruction from this location. It is possible to enter the debug state immediately from
RESET if the value 0x0004 is placed in the reset vector address (0x0000) of the IVT, and the
DBIREN bit of the DSPCTL register is set to 1. This results in the first instruction fetch
coming from the DBIR register. If no instruction is present then the processor waits for an
instruction to be loaded.

While the present invention is disclosed in a presently preferred context, it will be
recognized that the teachings of the present invention may be variously embodied consistent
with the disclosure and claims. By way of example, the present invention is disclosed in
connection with specific aspects of the ManArray architecture. It will be recognized that the
present teachings may be adapted to other present and future architectures to which they may
be beneficial, or to the ManArray architecture as it evolves in the future.

We claim:

1. A method of providing a hierarchy of interrupts for a plurality of processor
interrupt modes comprising the steps of:

establishing a plurality of processor mode programs at different priority modes of
operation including (1) user mode programs at lowest priority, (2) system mode programs,
(3) non-maskable interrupt mode programs and (4) debug mode programs at the highest
priority; and

utilizing a hierarchy of (1) general purpose interrupts (GPI) and system call interrupts
(SYSCALL) at user mode, (2) non-maskable interrupts at user and system mode and (3)
debug interrupts (DBI) at user, system and non-maskable interrupt mode to automatically
cause transitions between the modes of operation.

2. The method of claim 1 further comprising the step of utilizing hardware to
automatically mask or disable interrupts at the same or a lower level once an interrupt is
acknowledged and to record the processor modes in readable and writeable status and control
registers.

3. The method of claim 2 further comprising the step of reenabling a disabled
interrupt at the same or a lower level once the acknowledged interrupt has completed.

4. The method of claim 3 further comprising the step of copying to memory 
registers which were saved by hardware when the interrupt was acknowledged. 

5. The method of claim 1 wherein a default rule is that GPI, SYSCALL, NMI
and DBI may preempt a user mode program.

6. The method of claim 5 wherein SYSCALL explicitly preempts a specified 
user mode program. 

7. The method of claim 1 wherein a default rule is that NMI and DBI may
preempt a GPI program (ISR) running in the system mode.

8. The method of claim 1 wherein a default rule is that DBI may preempt an 
NMI program (ISR). 

9. The method of claim 1 wherein a default rule is that GPI save status of a
program counter, status flags and 2-cycle instruction data registers when acknowledged.

10. The method of claim 1 wherein a SYSCALL operates the same as a GPI from
the standpoint of saving state and uses the same registers as the GPIs.

11. The method of claim 1 wherein the DBI save status and 2-cycle instruction
data registers when they preempt user mode programs, but save only status information when
they preempt GPI ISRs or NMI ISRs.

12. The method of claim 1 wherein the NMI save status but share the same
hardware with GPI mode whereby NMI are non-recoverable, but the context in which they
occur is saved.

13. A method of initiating an interrupt comprising the steps of:
executing a load instruction; and

directly setting bits in an interrupt request register (IRR) that is located in a digital
signal processor (DSP) interrupt control unit (ICU) in response to the load instruction.

14. A method of initiating an interrupt comprising the steps of:
executing a DSU COPY instruction; and

directly setting bits in an interrupt request register (IRR) that is located in a digital
signal processor (DSP) interrupt control unit (ICU) in response to the DSU COPY
instruction.

15. A method of initiating an interrupt comprising the steps of:
executing on a BIT instruction; and

directly setting bits in an interrupt request register (IRR) that is located in a digital
signal processor (DSP) interrupt control unit (ICU) in response to the BIT instruction.

16. A method of generating a SYSCALL interrupt comprising the steps of:
establishing an argument to a SYSCALL instruction which is a vector number; and
executing on the SYSCALL instruction to establish the SYSCALL interrupt.

17. The method of claim 16 wherein the SYSCALL instruction is a control
instruction which combines the features of a call instruction with those of an interrupt and the
SYSCALL interrupt is a synchronous interrupt which operates at the same levels as general
purpose interrupts (GPIs).

18. The method of claim 16 wherein the vector number refers to an entry in a
SYSCALL table which is located in a sequence processor (SP) memory.

19. A method for invoking any interrupt type by writing to a particular address on
a master control bus (MCB) comprising the steps of:

mapping an address to an address interrupt;
writing to the address;
detecting the write to the address mapped to the address interrupt; and

asserting to a digital signal processor (DSP) core interrupt control unit the
corresponding interrupt signal.

20. A hardware system for providing interrupt forwarding registers comprising:
a sequence processor (SP);

at least one processing element (PE);

a compute register file (CRF);

a plurality of functional units; and

a condition generation unit (CGU); wherein when an interrupt occurs and is
acknowledged, all instructions in the decode phase are allowed to proceed through execute;
one-cycle instructions are allowed to complete and update their target registers and flags; and
any two-cycle instructions are allowed to complete, but their output which may include
output data, output register addresses and flag information is saved in a set of special purpose
interrupt forwarding registers and no update is made to the CRF or status registers.

21. The apparatus of claim 20 wherein the hardware comprises multiple PEs and
when an interrupt occurs interface signals are provided to all PEs to support operations
independently in each PE dependent upon the local PE instruction sequence prior to the
interrupt.

22. The apparatus of claim 21 wherein there are different mixtures of 1-cycle and
2-cycle instruction in each PE at the time of the interrupt, and by using the signal interface
and local information in each PE, the proper operation will occur in each PE on a return from
the interrupt.

23. The apparatus of claim 22 wherein interface signals include save/restore 
signals, interrupt signals, and extended or normal pipe status signals. 

24. The apparatus of claim 20 wherein the address of an instruction in a FETCH 
phase is saved to an appropriate link register. 

25. The apparatus of claim 20 wherein an interrupt handler is invoked through a
vector table and branch to target address.

26. The apparatus of claim 20 wherein when a RETI instruction is executed, it
causes a restoration of a saved save condition register (SCR0) and link address from
appropriate link and saved-status registers.

27. The apparatus of claim 20 wherein when an instruction at a link address
reaches the EXECUTE phase, data in interrupt forwarding registers for those units whose last
instruction prior to interrupt handling was a two-cycle instruction, is made available to the
CRF and the CGU instead of data coming from a corresponding unit.

28. A method for providing debug interrupt processing comprising the steps of:
initiating external debugger program communication with a target program core
through a master control bus (MCB) or through JTAG to a test module residing on the MCB;
and

initiating a debug interrupt on the target processor core utilizing the test module
residing on a master control bus (MCB).

29. The method of claim 28 further comprising the steps of:
storing an interrupt vector table including a debug vector containing an address of a
debug instruction register (DBIR); and

attempting an instruction fetch from the address of the DBIR causing a processor to
enter a STALL state, and causing a status bit to be posted to a debug status register (DBSTAT)
to indicate a debug stall is in effect thereby allowing the test module to hook the processor.

30. The method of claim 29 further comprising the steps of: 
detecting the debug stall bit set utilizing the test module; 
reading a section of instruction memory using MCB read accesses; 
saving the read section of instruction memory to an external location; and 
injecting debug monitor code into the read section of instruction memory. 

31. The method of claim 30 further comprising the step of writing a JMPD 
instruction to the DBIR. 

32. The method of claim 31 wherein a direct address contained in the JMPD 
instruction points to the debug monitor code; and this sequence of steps causes the DBIP bit 
to be set in the DBSTAT, indicating to the hardware that an instruction is present in the 
DBIR, causing a fetch unit to retrieve this instruction and execute it. 
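
By way of illustration only, the debug hook sequence of claims 29 through 32 may be modeled as below. The bit positions of the stall and DBIP flags, the class `TargetCore`, the function `hook_core`, and the clearing of both flags once the DBIR instruction is fetched are assumptions made for the sketch; the claims do not specify them.

```python
DBSTAT_STALL = 0x1  # assumed bit position of the debug-stall flag
DBSTAT_DBIP = 0x2   # assumed bit position of "instruction present in DBIR"


class TargetCore:
    """Toy model of the target core as seen from the test module."""

    def __init__(self, imem):
        self.imem = list(imem)  # instruction memory, one entry per instruction
        self.dbir = None        # debug instruction register
        self.dbstat = 0         # debug status register

    def fetch_from_dbir(self):
        """Fetch via the debug vector; stall if the DBIR holds no instruction."""
        if not self.dbstat & DBSTAT_DBIP:
            self.dbstat |= DBSTAT_STALL
            return None
        instr, self.dbir = self.dbir, None
        self.dbstat &= ~(DBSTAT_STALL | DBSTAT_DBIP)  # assumed clearing behavior
        return instr

    def mcb_write_dbir(self, instr):
        """An MCB write to the DBIR posts the DBIP bit."""
        self.dbir = instr
        self.dbstat |= DBSTAT_DBIP


def hook_core(core, monitor, base):
    """Test-module side: save a section of memory, inject the monitor, write JMPD."""
    if not core.dbstat & DBSTAT_STALL:
        raise RuntimeError("core is not in debug stall")
    saved = core.imem[base:base + len(monitor)]    # read and save the section
    core.imem[base:base + len(monitor)] = monitor  # inject debug monitor code
    core.mcb_write_dbir(("JMPD", base))            # direct jump to the monitor
    return saved
```

In this model the returned `saved` list stands in for the external location of claim 30, from which the original instructions could later be restored.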

33. The apparatus of claim 20, wherein each PE further comprises a program 
settable SetCC register, SetCC decode logic, logic that combines flags as specified by the 
SetCC decode logic, and an interrupt signal interface from each PE to interrupt control logic 
in the SP for the purposes of specifying interrupts independently from each PE, collectively 
gathering PE interrupts in the interrupt control unit, and causing PE interrupts. 

34. A hardware system providing array conditional execution comprising: 
a sequence processor (SP); 
at least one processing element (PE); 
a plurality of functional units; 
a condition generation unit (CGU); 
a program settable SetCC register; 
SetCC decode logic; 
logic that combines flags as specified by the SetCC decode logic; and 
conditional execution control logic that allows arithmetic condition flags (ACFs) to 
be set when an instruction specifies execute and set ACFs as specified by the SetCC register. 

35. A hardware system providing conditional branch capability comprising: 
a sequence processor (SP); 
at least one processing element (PE); 
at least one execution unit supporting packed data operations; and 
OR logic that combines side effects of individual packed data operations to create 
signal categories of side effect signals representing the OR of the individual packed data 
operation side effects. 
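
By way of illustration only, the OR logic of claim 35 may be sketched as follows. The representation of each packed element's side effects as a dictionary of C, N, V and Z flags, and the function name `combine_side_effects`, are assumptions made for the sketch.

```python
def combine_side_effects(per_element_flags):
    """OR each side-effect category (C, N, V, Z) across all packed elements.

    `per_element_flags` is a list with one dict per packed data element,
    each mapping "C", "N", "V" and "Z" to a truthy or falsy value.
    """
    return {name: any(flags[name] for flags in per_element_flags)
            for name in ("C", "N", "V", "Z")}
```

A conditional branch could then test a single combined flag, for example the combined N flag, regardless of how many data elements the packed operation processed.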

36. A method of providing a hierarchy of interrupts for a plurality of processor 
interrupt modes of an array processor comprising the steps of: 

establishing a plurality of processor mode programs at different priority modes of 
operation including (1) user mode programs at lowest priority, (2) system mode programs, 
(3) non-maskable interrupt mode programs and (4) debug mode programs at the highest 
priority; and 

utilizing a hierarchy of (1) general purpose interrupts (GPI) and system call interrupts 
(SYSCALL) at user mode, (2) non-maskable interrupts at user and system mode and (3) 
debug interrupts (DBI) at user, system and non-maskable interrupt mode to automatically 
cause transitions between the modes of operation. 

37. The method of claim 36 further comprising the step of utilizing hardware to 
automatically mask or disable interrupts at the same or a lower level once an interrupt is 
acknowledged and to record the processor modes in readable and writeable status and control 
registers. 

38. The method of claim 37 further comprising the step of reenabling a disabled 
interrupt at the same or a lower level once the acknowledged interrupt has completed. 

39. The method of claim 38 further comprising the step of copying to memory 
registers which were saved by hardware when the interrupt was acknowledged. 

40. The method of claim 36 wherein a default rule is that GPI, SYSCALL, NMI 
and DBI may preempt a user mode program. 

41. The method of claim 40 wherein SYSCALL explicitly preempts a specified 
user mode program. 

42. The method of claim 36 wherein a default rule is that NMI and DBI may 
preempt a GPI program (ISR) running in the system mode. 

43. The method of claim 36 wherein a default rule is that DBI may preempt an 
NMI program (ISR). 

44. The method of claim 36 wherein a default rule is that GPI save status of a 
program counter, status flags and 2-cycle instruction data registers when acknowledged. 

45. The method of claim 36 wherein a SYSCALL operates the same as a GPI 
from the standpoint of saving state and uses the same registers as the GPIs. 

46. The method of claim 36 wherein the DBI save status and 2-cycle instruction 
data registers when they preempt user mode programs, but save only status information when 
they preempt GPI ISRs or NMI ISRs. 

47. The method of claim 36 wherein the NMI save status but share the same 
hardware with GPI mode whereby NMI are non-recoverable, but the context in which they 
occur is saved. 
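
By way of illustration only, the default preemption rules of claims 36, 40, 42 and 43 may be sketched as a comparison of mode priorities. The numeric mode values and the names `Mode`, `TARGET_MODE` and `may_preempt` are assumptions made for the sketch; only the ordering of the modes is taken from the claims.

```python
from enum import IntEnum


class Mode(IntEnum):
    """Processor modes, lowest priority first (numeric values are assumed)."""
    USER = 0
    SYSTEM = 1
    NMI = 2
    DEBUG = 3


# Mode entered when each interrupt type is acknowledged.
TARGET_MODE = {
    "GPI": Mode.SYSTEM,
    "SYSCALL": Mode.SYSTEM,
    "NMI": Mode.NMI,
    "DBI": Mode.DEBUG,
}


def may_preempt(interrupt: str, current: Mode) -> bool:
    """An interrupt preempts only a program running at a strictly lower mode,
    which matches masking of interrupts at the same or a lower level."""
    return TARGET_MODE[interrupt] > current
```

For example, `may_preempt("NMI", Mode.SYSTEM)` holds, reflecting that an NMI may preempt a GPI ISR, while `may_preempt("GPI", Mode.SYSTEM)` does not.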

48. A method of initiating an interrupt in an array processor comprising the steps 
of: 

executing a load instruction; and 

directly setting bits in an interrupt request register (IRR) that is located in a digital 
signal processor (DSP) interrupt control unit (ICU) in response to the load instruction. 

49. A method of initiating an interrupt in an array processor comprising the steps 
of: 

executing a DSU COPY instruction; and 

directly setting bits in an interrupt request register (IRR) that is located in a digital 
signal processor (DSP) interrupt control unit (ICU) in response to the DSU COPY 
instruction. 

50. A method of initiating an interrupt in an array processor comprising the steps 
of: 

executing on a BIT instruction; and 

directly setting bits in an interrupt request register (IRR) that is located in a digital 
signal processor (DSP) interrupt control unit (ICU) in response to the BIT instruction. 
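
By way of illustration only, the software-initiated interrupts of claims 48 through 50 may be sketched as writes that set bits in the IRR. The 32-bit register width, the method names, and the gating of pending requests by a bitwise AND with the IER are assumptions made for the sketch.

```python
class InterruptControlUnit:
    """Toy model of the IRR and IER in the DSP interrupt control unit."""

    def __init__(self, width=32):  # register width is an assumption
        self.mask = (1 << width) - 1
        self.irr = 0  # interrupt request register
        self.ier = 0  # interrupt enable register

    def load_irr(self, value):
        """Load or DSU COPY targeting the IRR: sets every bit that is 1 in `value`."""
        self.irr |= value & self.mask

    def bit_set_irr(self, bit):
        """BIT-style instruction setting one request bit."""
        self.irr |= (1 << bit) & self.mask

    def pending(self):
        """Requests that are both latched and enabled (assumed AND gating)."""
        return self.irr & self.ier
```

In this model a request bit set by software is indistinguishable from one latched from an external source once it is held in the IRR.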

51. A method of generating a SYSCALL interrupt in an array processor 
comprising the steps of: 

establishing an argument to a SYSCALL instruction which is a vector number; and 
executing on the SYSCALL instruction to establish the SYSCALL interrupt. 

52. The method of claim 51 wherein the SYSCALL instruction is a control 
instruction which combines the features of a call instruction with those of an interrupt and the 
SYSCALL interrupt is a synchronous interrupt which operates at the same levels as general 
purpose interrupts (GPIs). 

53. The method of claim 51 wherein the vector number refers to an entry in a 
SYSCALL table which is located in a sequence processor (SP) memory. 
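
By way of illustration only, the SYSCALL dispatch of claims 51 through 53 may be sketched as a table lookup that also records a link address. The function name `syscall`, the dictionary result, and the use of the string "system" for the mode entered are assumptions made for the sketch.

```python
def syscall(vector_number, syscall_table, next_pc):
    """Dispatch a SYSCALL: look up the handler address in the SYSCALL table
    (held in SP memory) and keep the return address, as a call would."""
    handler_address = syscall_table[vector_number]
    return {
        "pc": handler_address,  # execution continues in the handler
        "link": next_pc,        # call-like feature: return address is saved
        "mode": "system",       # interrupt-like feature: same level as GPIs (assumed label)
    }
```

The sketch shows the combination of a call, through the saved link, with an interrupt, through the vector lookup and the mode change.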

54. A method for invoking any interrupt type by writing to a particular address on 
a master control bus (MCB) of an array processor comprising the steps of: 

mapping an address to an address interrupt; 
writing to the address; 

detecting the write to the address mapped to the address interrupt; and 
asserting to a digital signal processor (DSP) core interrupt control unit the 
corresponding interrupt signal. 
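
By way of illustration only, the address interrupt of claim 54 may be sketched as a lookup performed on every MCB write. The class `AddressInterruptUnit`, its method names, and the example address and interrupt label in the usage note are hypothetical.

```python
class AddressInterruptUnit:
    """Toy model of address-mapped interrupt generation on the MCB."""

    def __init__(self):
        self.address_map = {}  # MCB address -> interrupt signal name
        self.asserted = []     # signals asserted to the DSP core ICU, in order

    def map_address(self, address, interrupt):
        """Map an MCB address to an address interrupt."""
        self.address_map[address] = interrupt

    def mcb_write(self, address, data):
        """Detect a write to a mapped address and assert the matching signal.
        The written data is ignored in this sketch."""
        interrupt = self.address_map.get(address)
        if interrupt is not None:
            self.asserted.append(interrupt)
```

For instance, after `map_address(0x7000, "GPI3")`, any bus master writing to address 0x7000 causes "GPI3" to be asserted, while writes to unmapped addresses have no effect.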

Mnemonic  Type  Name                                   Description
INTSRC    SPR   Core Interrupt Select Register         Selects between an external and an internal source for an
                                                       interrupt signal to the DSP ICU.
ADIEN     SPR   Address Interrupt Enable Register      Enables address ranges for interrupt generation.
IER             Interrupt Enable Register              Interrupt enable/disable at the DSP core.
IRR       MRF   Interrupt Request Register             Latches pending interrupts to the DSP core.
SCR0      MRF   Status and Control Register 0          Bits GIE, NMIE and DBIE control interrupt enables for GPIs,
                                                       NMI and Debug interrupt classes respectively. The 'ILVL'
                                                       field indicates the current operating mode and is used to
                                                       determine RETI operation (which registers are selected for
                                                       hardware restore).
MRFXAR    MRF   MRF Extension Address Register         The value in this register determines the currently addressed
                                                       MRF Extension Register. A bit in this register also enables
                                                       auto-increment of the address.
MRFXDR    MRF   MRF Extension Data Register            This register indirectly allows access to the MRF Extension
                                                       register addressed by MRFXAR.

Mnemonic  Type  Name                                   Description
                ALU Interrupt Forwarding Register 0    Stores first 32 bits of ALU 2-cycle instruction write-data
                                                       during interrupt processing.
ALUIFR1   MRFX  ALU Interrupt Forwarding Register 1    Stores second 32 bits of ALU 2-cycle instruction write-data
                                                       during interrupt processing.
MAUIFR0   MRFX  MAU Interrupt Forwarding Register 0    Stores first 32 bits of MAU 2-cycle instruction write-data
                                                       during interrupt processing.
MAUIFR1   MRFX  MAU Interrupt Forwarding Register 1    Stores second 32 bits of MAU 2-cycle instruction write-data
                                                       during interrupt processing.
                Interrupt Forwarding Register Address  Stores IFR target register addresses and write-enables for
                                                       2-cycle instructions during interrupt processing.
                Saved Status Register 0                Stores first 32 bits of hot conditions for interrupt processing.
                Saved Status Register 1                Stores second 32 bits of hot conditions for interrupt processing.
          MRFX  Saved Status Register 2                Stores third 32 bits of hot conditions for interrupt processing.
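
By way of illustration only, the indirect access to MRF Extension registers through MRFXAR and MRFXDR may be sketched as below. The number of extension registers, the 32-bit data width, and the choice to auto-increment on both reads and writes are assumptions made for the sketch, as are the class and method names.

```python
class MrfExtension:
    """Toy model of MRFXAR/MRFXDR indirect register access."""

    def __init__(self, size=16):  # number of extension registers is assumed
        self.regs = [0] * size
        self.address = 0
        self.auto_increment = False

    def write_mrfxar(self, address, auto_increment=False):
        """Select the addressed extension register and the auto-increment mode."""
        self.address = address % len(self.regs)
        self.auto_increment = auto_increment

    def read_mrfxdr(self):
        """Read the extension register currently addressed by MRFXAR."""
        value = self.regs[self.address]
        self._advance()
        return value

    def write_mrfxdr(self, value):
        """Write the extension register currently addressed by MRFXAR."""
        self.regs[self.address] = value & 0xFFFFFFFF
        self._advance()

    def _advance(self):
        # With auto-increment enabled, step to the next extension register.
        if self.auto_increment:
            self.address = (self.address + 1) % len(self.regs)
```

With auto-increment enabled, an interrupt handler could copy a run of consecutive extension registers to memory with one address write followed by repeated data reads.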

Encoding                              Effect on ACFs        Example Instruction
00        Execute                     DO NOT AFFECT         copy.sd.w R0, R1
01        Cond. Exec if F0 is True    DO NOT AFFECT         T.copy.sd.w R0, R1
10        Cond. Exec if F0 is False   DO NOT AFFECT         F.copy.sd.w R0, R1
11        Execute                     ACFs <- SetCC Flags   Addcc... where cc is encoded in SetCC register as specified in Table A

Encoding                                                              Effect on ACFs        Example Instruction
000       Execute                                                     Do Not Affect         add.sa.1w R0, R1, R2
001       Cond. Exec if F0 is True                                    Do Not Affect         T.add.sa.1w R0, R1, R2
010       Cond. Exec if F0 is False                                   Do Not Affect         F.add.sa.1w R0, R1, R2
011       Execute                                                     ACFs <- SetCC Flags   Addcc... where cc is encoded in SetCC register as specified in Table A
100       Cond. Exec on multiple flags determined by the number of    Do Not Affect         Tm.add.sa.4h R0, R2, R4
          data elements in the current instruction; if Fn is True,
          operate on the corresponding data element.
101       Cond. Exec on multiple flags determined by the number of    Do Not Affect         Fm.add.sa.4h R0, R2, R4
          data elements in the current instruction; if Fn is False,
          operate on the corresponding data element.
110       Execute                                                     ACFs <- N             SprecvN.pd.w R0, R1, 2x2PE1
111       Execute                                                     ACFs <- Z             SprecvZ.pd.w R0, R1, 2x2PE0
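
By way of illustration only, the execute decision implied by the three-bit encodings tabulated above, read in sequence as 000 through 111, may be sketched as follows. The function name `should_execute` and the representation of the ACFs as a list indexed by data element are assumptions made for the sketch; the update of the ACFs themselves is not modeled.

```python
def should_execute(ce3, flags, element=0):
    """Decide whether the operation on data element `element` executes.

    `ce3` is the 3-bit conditional-execution code and `flags` is the list
    of arithmetic condition flags [F0, F1, ...].
    """
    if ce3 in (0b000, 0b011, 0b110, 0b111):
        return True                # unconditional Execute encodings
    if ce3 == 0b001:
        return bool(flags[0])      # T. prefix: execute if F0 is True
    if ce3 == 0b010:
        return not flags[0]        # F. prefix: execute if F0 is False
    if ce3 == 0b100:
        return bool(flags[element])  # Tm. prefix: per-element test of Fn
    if ce3 == 0b101:
        return not flags[element]    # Fm. prefix: per-element test of Fn
    raise ValueError("ce3 must be a 3-bit value")
```

For a four-element packed operation with a Tm. prefix, the function would be evaluated once per element, so that only the elements whose flag is set are operated upon.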

SetCC bits  cc        Description                                  C-N-V-Z Setting of F0 and cc exception signal
00000       AVS       Any Overflow Set                             cc, F0 <- Size[V7 OR V6 OR ... V0]
00001       AHS (CS)  Any Higher or Same (unsigned, or Carry Set)  cc, F0 <- Size[C7 OR C6 OR ... C0]
00010       ANEG      Any Negative                                 cc, F0 <- Size[N7 OR N6 OR ... N0]
00011       AZ (EQ)   Any Zero or Equal                            F0 <- Size[Z7 OR Z6 OR ... Z0]
00100       AHI       Any Higher (unsigned)                        cc, F0 <- Size{[(C7=1) && (Z7=0)] OR [(C6=1) && (Z6=0)] OR ... [(C0=1) && (Z0=0)]}
00101       ALS       Any Lower or Same (unsigned)                 cc, F0 <- Size{[(C7=0) || (Z7=1)] OR [(C6=0) || (Z6=1)] OR ... [(C0=0) || (Z0=1)]}
00110       APOS      Any Positive                                 cc, F0 <- Size{[(N7=0) && (Z7=0)] OR [(N6=0) && (Z6=0)] OR ... [(N0=0) && (Z0=0)]}
00111       AGT       Any Greater-Than (signed)                    cc, F0 <- Size{[(Z7=0) && (N7=V7)] OR [(Z6=0) && (N6=V6)] OR ... [(Z0=0) && (N0=V0)]}
01000       ALE       Any Less-than or Equal (signed)              cc, F0 <- Size{[(Z7=1) || (N7 != V7)] OR [(Z6=1) || (N6 != V6)] OR ... [(Z0=1) || (N0 != V0)]}
01001       ALT       Any Less-Than (signed)                       cc, F0 <- Size{[N7 != V7] OR [N6 != V6] OR ... [N0 != V0]}
01010                 Reserved
01011                 Reserved
01100                 Reserved
01101                 Reserved
01110                 Reserved
01111                 Reserved

SetCC bits  cc        Description                                            C-N-V-Z Setting of ACFs
10000       VS        Overflow Set per operation                             Size{F7 <- V7, F6 <- V6, ..., F0 <- V0}
10001       HS (CS)   Higher or Same (unsigned, or Carry Set) per operation  Size{F7 <- C7, F6 <- C6, ..., F0 <- C0}
10010       NEG       Negative per operation                                 Size{F7 <- N7, F6 <- N6, ..., F0 <- N0}
10011       Z (EQ)    Zero or Equal per operation                            Size{F7 <- Z7, F6 <- Z6, ..., F0 <- Z0}
10100       HI        Higher (unsigned) per operation                        Size{F7 <- [(C7=1) && (Z7=0)], F6 <- [(C6=1) && (Z6=0)], ..., F0 <- [(C0=1) && (Z0=0)]}
10101       LS        Lower or Same (unsigned) per operation                 Size{F7 <- [(C7=0) || (Z7=1)], F6 <- [(C6=0) || (Z6=1)], ..., F0 <- [(C0=0) || (Z0=1)]}
10110       POS       Positive per operation                                 Size{F7 <- [(N7=0) && (Z7=0)], F6 <- [(N6=0) && (Z6=0)], ..., F0 <- [(N0=0) && (Z0=0)]}
10111       GT        Greater-Than (signed) per operation                    Size{F7 <- [(Z7=0) && (N7=V7)], F6 <- [(Z6=0) && (N6=V6)], ..., F0 <- [(Z0=0) && (N0=V0)]}
11000       LE        Less-than or Equal (signed) per operation              Size{F7 <- [(Z7=1) || (N7 != V7)], F6 <- [(Z6=1) || (N6 != V6)], ..., F0 <- [(Z0=1) || (N0 != V0)]}
11001       LT        Less-Than (signed) per operation                       Size{F7 <- [N7 != V7], F6 <- [N6 != V6], ..., F0 <- [N0 != V0]}
11010                 Reserved
11011                 Reserved
11100                 Reserved
11101                 Reserved
11110                 Reserved
11111                 Reserved
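
By way of illustration only, the flag settings tabulated for the SetCC conditions may be sketched as a set of predicates over the C, N, V and Z side effects of each packed data element. The names `PREDICATES`, `per_operation_acfs` and `any_condition_f0`, and the representation of each element's side effects as a dictionary, are assumptions made for the sketch; the length of the element list stands in for the Size of the packed operation.

```python
# One predicate per condition code, evaluated on a single element's C, N, V, Z bits.
PREDICATES = {
    "VS": lambda f: f["V"] == 1,
    "HS": lambda f: f["C"] == 1,
    "NEG": lambda f: f["N"] == 1,
    "Z": lambda f: f["Z"] == 1,
    "HI": lambda f: f["C"] == 1 and f["Z"] == 0,
    "LS": lambda f: f["C"] == 0 or f["Z"] == 1,
    "POS": lambda f: f["N"] == 0 and f["Z"] == 0,
    "GT": lambda f: f["Z"] == 0 and f["N"] == f["V"],
    "LE": lambda f: f["Z"] == 1 or f["N"] != f["V"],
    "LT": lambda f: f["N"] != f["V"],
}


def per_operation_acfs(cc, elements):
    """Per-operation conditions: return [F0, F1, ...], one flag per element.
    `elements[0]` holds the side effects of data element 0."""
    return [PREDICATES[cc](f) for f in elements]


def any_condition_f0(cc, elements):
    """'Any' conditions such as 'AGT': F0 is the OR of the per-element results.
    The leading 'A' is stripped to find the underlying predicate."""
    return any(per_operation_acfs(cc[1:], elements))
```

The two functions differ only in whether the per-element results are kept separate, as for the ACF settings, or reduced by an OR into F0, as for the "any" conditions.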